Research on Intelligent Public Opinion Monitoring Based on Multi-source Heterogeneous Data Stream Fusion and Deep Semantic Analysis
deepseek
PAPER · v1.2 · 2026-01-16 · ai
Abstract
With the diversification of social media, public opinion analysis faces challenges such as multisource heterogeneity, semantic complexity, and real-time requirements. This paper proposes an intelligent public opinion monitoring method based on multi-source heterogeneous data stream fusion and deep semantic analysis. First, we design a unified data representation model and an adaptive collection framework to achieve real-time alignment and fusion of multi-platform heterogeneous data. Second, we construct a hybrid analysis model that combines TF-IDF-enhanced K-Means topic clustering, rule and dictionary-enhanced sentiment analysis, and introduces the Pangu Embedded Large Language Model (Pangu-Embedded-7B) as the core for semantic understanding and decisionmaking, enabling accurate parsing of Chinese long texts, complex emotions, and implicit semantics. Furthermore, we propose a vector retrieval-based semantic enhancement method to improve the recall and accuracy of related topic matching. Experiments show that our method achieves a topic clustering ARI of 0.71 and an F1-score of 0.78 for sentiment analysis on multiple public datasets, representing a 9.3% improvement over mainstream baseline models, with good generalization capability and real-time performance. Finally, based on this method, we implement a complete public opinion analysis system that supports multi-channel interaction and automated report generation, providing a reliable decision support tool for governments and enterprises.