Natural Language Processing Overview

Let's sketch the big picture of natural language processing (NLP).

Background

Hypothesis

Distributional hypothesis

You shall know a word by the company it keeps

Firth, 1957 (Studies in Linguistic Analysis)

Meanings of words are (largely) determined by their distributional patterns (Distributional Hypothesis)

Harris, 1968 (Mathematical Structures of Language)

Words that occur in similar contexts will have similar meanings (Strong Contextual Hypothesis)

Miller and Charles, 1991 (Language and Cognitive Processes)

Various extensions…

Ref

Contexts

The unit of text (string) that serves as the reference context

windows (of size n), sentences, paragraphs, documents, etc.
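
As a minimal sketch of the window-based context mentioned above, the snippet below collects, for each token, the up-to-n tokens on either side. The function name `window_contexts` and the whitespace tokenization are assumptions made for illustration only; a sentence, paragraph, or document context would simply use a larger unit in place of the window.

```python
def window_contexts(tokens, n=2):
    """Return (target, context) pairs using a symmetric window of size n."""
    contexts = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - n):i]      # up to n tokens before the target
        right = tokens[i + 1:i + 1 + n]     # up to n tokens after the target
        contexts.append((target, left + right))
    return contexts


# Naive whitespace tokenization (an assumption, not a recommended tokenizer).
tokens = "you shall know a word by the company it keeps".split()
contexts = window_contexts(tokens, n=2)
```

With n=2, the context of "word" in this sentence is the four tokens "know", "a", "by", "the"; tokens at the edges of the sentence get shorter contexts.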

Lexical Features

N-gram

Dictionary based Tokenization

Unsupervised Segmentation

Co-occurrences


Models

Bag of words

Word Weighting

Vector Space Model

Context representations

first order vector

Term-Document Matrix

second order vector

Term-Co-occurrence Matrix

Dimensionality Reduction

SVD (Singular Value Decomposition) and LSA (Latent Semantic Analysis)
MDS (Multi-Dimensional Scaling)
PCA (Principal Component Analysis), unsupervised learning
ICA (Independent Components Analysis)
LDA (Linear Discriminant Analysis, Fisher’s LDA)
LDA (Latent Dirichlet Allocation)

Similarity

Measuring Similarity

Distance

Generative model


Semantics

Word Embedding

Sequence-to-Sequence


Applications

Collocations

Topic Modeling

Comparing Corpora


Measures

Measures of Association

Probability

T-score

Z-score

Chi-Square Statistic (χ²)

The distance between observed and expected values

\[\chi^2=\sum_{k=1}^{n} \frac{(O_k - E_k)^2}{E_k}\]
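
The formula above can be sketched directly in code. The function name `chi_square` and the counts below are hypothetical, chosen only to make the arithmetic easy to follow; they do not come from any real corpus.

```python
def chi_square(observed, expected):
    """Sum of (O_k - E_k)^2 / E_k over all k cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: four cells, each with an expected count of 25.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
stat = chi_square(observed, expected)
```

Here the squared differences are 49, 9, 25, and 225, so the statistic is 308 / 25 = 12.32. The larger the value, the further the observed counts are from what was expected.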

Log-likelihood ratio (G²)

Information Theory

Entropy

KL divergence

MI (Mutual Information)


Open problems

Word-sense disambiguation


Visualization

Data Visualization


Lectures

Natural language processing lectures
