Step 11 — Building a Simple NLP Analysis App with Qt and Matplotlib in Python
Introduction
Now that you have learned the fundamentals of text cleaning, tokenization, and word frequency analysis, the next step is to build a GUI application. Using the Qt framework and Matplotlib, you'll create a two-pane app: the left side takes text input, and the right side shows the analysis and a frequency plot. The app brings together several earlier steps: practical NLP, visualization, and interactive Python coding.
Main concept explained clearly
Qt for GUI: Qt is a framework for building desktop applications, available in Python through bindings such as PyQt5 or PySide6 (this step uses PyQt5). It provides ready-made widgets for input areas, buttons, and result displays.
Matplotlib embedding: Matplotlib can draw a figure onto a canvas that is itself a Qt widget, so the word frequency plot appears inside the app window rather than in a separate one.
Workflow:
- Left pane: Paste/type any text.
- Button: “Process” triggers the NLP pipeline.
- Right pane: Tabs showing the tokens and word frequencies, a top-words analysis, and a Matplotlib bar chart.
Why this matters in NLP
- Makes real analysis more accessible and interactive.
- Encourages experimentation with different texts.
- Gives instant feedback and visualization for NLP insights.
- Bridges the gap from scripting to real applications.
Python example
Install prerequisites:
pip install pyqt5 matplotlib
Minimal Qt app covering all major previous steps:
```python name=step11_nlp_qt_app.py
import sys
import string

from PyQt5.QtWidgets import (
    QApplication, QWidget, QVBoxLayout, QHBoxLayout,
    QTextEdit, QPushButton, QLabel, QTabWidget,
)
from matplotlib.figure import Figure
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas


class NLPApp(QWidget):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Python NLP Analyzer")
        self.setGeometry(100, 100, 900, 500)
        layout = QHBoxLayout(self)

        # Left: input box and Process button
        left_layout = QVBoxLayout()
        self.input_box = QTextEdit()
        self.input_box.setPlaceholderText("Paste your text here...")
        left_layout.addWidget(QLabel("Input Text"))
        left_layout.addWidget(self.input_box)
        self.process_btn = QPushButton("Process")
        left_layout.addWidget(self.process_btn)
        layout.addLayout(left_layout)

        # Right: tabs for results
        self.tabs = QTabWidget()
        self.token_tab = QTextEdit()
        self.token_tab.setReadOnly(True)
        self.freq_tab = QTextEdit()
        self.freq_tab.setReadOnly(True)
        self.graph_tab = QWidget()
        self.tabs.addTab(self.token_tab, "Tokens & Frequencies")
        self.tabs.addTab(self.freq_tab, "Analysis")
        self.tabs.addTab(self.graph_tab, "Graph")
        layout.addWidget(self.tabs)
        self.process_btn.clicked.connect(self.process_text)

        # Graph elements: a Figure created directly (not through pyplot)
        # so that it is owned by this widget's canvas.
        self.figure = Figure(figsize=(5, 4))
        self.canvas = FigureCanvas(self.figure)
        graph_layout = QVBoxLayout(self.graph_tab)
        graph_layout.addWidget(self.canvas)

    def process_text(self):
        text = self.input_box.toPlainText()
        if not text.strip():
            # Nothing to analyze: reset all three tabs and stop.
            self.token_tab.setPlainText("No input detected.")
            self.freq_tab.setPlainText("No analysis.")
            self.figure.clf()
            self.canvas.draw()
            return

        # Clean and tokenize: lowercase, drop ASCII punctuation, split on whitespace
        clean = text.strip().lower()
        translator = str.maketrans('', '', string.punctuation)
        no_punct = clean.translate(translator)
        tokens = no_punct.split()
        # Small mixed English/Portuguese stopword set
        stop_words = {
            'the', 'is', 'in', 'it', 'by', 'and', 'a', 'of', 'to', 'for', 'on',
            'o', 'de', 'em', 'para', 'with', 'as', 'an', 'at', 'this', 'that',
        }
        filtered = [w for w in tokens if w not in stop_words]

        # Word frequency
        word_freq = {}
        for w in filtered:
            word_freq[w] = word_freq.get(w, 0) + 1

        # Display tokens & frequencies
        self.token_tab.setPlainText(
            f"Tokens (stopwords removed):\n{filtered}\n\nFrequencies:\n{word_freq}"
        )

        # Analysis: top N words by count
        sorted_words = sorted(word_freq.items(), key=lambda item: item[1], reverse=True)
        analysis = ""
        top_n = 8
        for i, (word, count) in enumerate(sorted_words[:top_n]):
            analysis += f"{i+1}. {word}: {count}\n"
        analysis += "\nTry different inputs for varied results."
        self.freq_tab.setPlainText(analysis)

        # Bar plot (Matplotlib)
        self.figure.clf()
        if sorted_words:
            words = [w for w, c in sorted_words[:top_n]]
            counts = [c for w, c in sorted_words[:top_n]]
            ax = self.figure.add_subplot(111)
            ax.bar(words, counts, color='skyblue')
            ax.set_xlabel('Word')
            ax.set_ylabel('Frequency')
            ax.set_title(f"Top {len(words)} Words")
            self.figure.tight_layout()
        self.canvas.draw()


if __name__ == "__main__":
    app = QApplication(sys.argv)
    win = NLPApp()
    win.show()
    sys.exit(app.exec_())
```
Line-by-line explanation of the code
- Import PyQt5, Matplotlib, and text processing tools.
- The `NLPApp` class builds the GUI:
  - Left pane is for the input text and the Process button.
  - Right pane includes three tabs: tokens/frequencies, analysis, and plot.
- On clicking “Process”:
  - Text is cleaned: lowercased, with punctuation removed.
  - Text is split into tokens, and stopwords are filtered out.
  - Word frequencies are counted.
  - Results are shown in the different tabs.
  - A bar plot is drawn with Matplotlib.
- The GUI is simple enough for beginners and can be extended in future steps.
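The processing steps listed above do not depend on Qt, so they can be pulled out into a plain function and checked without opening a window. The following is a minimal sketch under stated assumptions: the helper name `analyze_text` and the shortened `STOP_WORDS` set are hypothetical and not part of the app code, and `collections.Counter` stands in for the manual counting loop.

```python
import string
from collections import Counter

# Shortened stopword set for illustration only (an assumption);
# the app itself uses a longer list.
STOP_WORDS = {"the", "is", "in", "it", "and", "a", "of", "to"}


def analyze_text(text, stop_words=STOP_WORDS, top_n=8):
    """Return (tokens, frequencies, top words) for the given text."""
    # Lowercase, then delete ASCII punctuation characters.
    cleaned = text.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Whitespace tokenization followed by stopword filtering.
    tokens = [w for w in cleaned.split() if w not in stop_words]
    freq = Counter(tokens)
    # most_common returns (word, count) pairs sorted by count, descending.
    return tokens, dict(freq), freq.most_common(top_n)


tokens, freq, top = analyze_text("The cat sat. The cat ran!")
print(tokens)  # ['cat', 'sat', 'cat', 'ran']
print(top[0])  # ('cat', 2)
```

With a function like this, `process_text` would only need to call it and copy the results into the tabs, which keeps the NLP logic testable on its own.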
Practical notes
- You can quickly adapt stopword lists for other languages or richer analysis.
- This app can serve as a foundation for more complex NLP tasks: sentiment, named entity recognition, and more.
- Switch between tabs to see the token list, the top-words analysis, or the bar chart.
- PyQt5 is cross-platform, so the app should run on Windows, Linux, and macOS.
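One way to adapt the stopword list for other languages is to key the sets by language code. The sketch below assumes a hypothetical `STOP_WORDS_BY_LANG` mapping and `filter_tokens` helper; the word lists are tiny illustrative samples, not complete stopword lists.

```python
# Tiny sample stopword sets keyed by language code (illustrative, not complete).
STOP_WORDS_BY_LANG = {
    "en": {"the", "is", "in", "it", "and", "a", "of", "to"},
    "pt": {"o", "a", "de", "em", "para", "e", "que"},
    "es": {"el", "la", "de", "en", "para", "y", "que"},
}


def filter_tokens(tokens, lang="en"):
    """Drop tokens found in the stopword set for `lang`.

    An unknown language code falls back to an empty set, so nothing is removed.
    """
    stop_words = STOP_WORDS_BY_LANG.get(lang, set())
    return [t for t in tokens if t not in stop_words]


tokens = "o gato dorme em casa".split()
print(filter_tokens(tokens, lang="pt"))  # ['gato', 'dorme', 'casa']
print(filter_tokens(tokens, lang="en"))  # ['o', 'gato', 'dorme', 'em', 'casa']
```

Keeping the lists separate per language avoids the situation where a word that is a stopword in one language (such as Portuguese "a") is silently removed from text in another.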
Suggested mini exercise
- Paste a text in Portuguese or Spanish, and add stopwords to observe the effect.
- Try different genres of text—news, tech, poetry—and compare token and frequency results.
- Modify the app to show bigrams or trigrams in a new tab.
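As a starting point for the last exercise, n-grams can be built from the token list with `zip`. This is a minimal sketch, assuming a hypothetical helper named `ngrams`; it is one possible approach, not the only one.

```python
from collections import Counter


def ngrams(tokens, n=2):
    """Return every run of n consecutive tokens as a tuple."""
    # zip over n shifted copies of the list: tokens[0:], tokens[1:], ...
    return list(zip(*(tokens[i:] for i in range(n))))


tokens = ["natural", "language", "processing", "with", "natural", "language"]
bigram_freq = Counter(ngrams(tokens, 2))
print(bigram_freq.most_common(1))  # [(('natural', 'language'), 2)]
print(ngrams(tokens, 3)[0])        # ('natural', 'language', 'processing')
```

In the app, the result could be shown in an additional read-only `QTextEdit` added with `self.tabs.addTab`, following the pattern of the existing tabs. Whether to build n-grams before or after stopword removal is a design choice: removing stopwords first yields pairs of content words that were not necessarily adjacent in the original text.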
Conclusion
You have now consolidated your NLP core skills into a simple, interactive app using Python, Qt, and Matplotlib. The application is a practical tool for instant analysis and visualization, demonstrating your ability to turn scripts into real desktop utilities. From here, you can expand and specialize your app to tackle broader NLP tasks with ease.
