Define unigrams, bigrams, and N-grams in detail. What are N-grams useful for? Also explain the Hidden Markov Model in detail.
Consider the example sentence: “I love reading blogs about data science on Analytics Vidhya.”
A 1-gram (or unigram) is a one-word sequence. For the above sentence, the unigrams would simply be: “I”, “love”, “reading”, “blogs”, “about”, “data”, “science”, “on”, “Analytics”, “Vidhya”.
A 2-gram (or bigram) is a two-word sequence of words, like “I love”, “love reading”, or “Analytics Vidhya”. A 3-gram (or trigram) is a three-word sequence, like “I love reading”, “about data science”, or “on Analytics Vidhya”.
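To make the definitions concrete, here is a minimal Python sketch that slides a window of size n over the tokenized example sentence. The helper name extract_ngrams is illustrative, not from any library:

```python
def extract_ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "I love reading blogs about data science on Analytics Vidhya"
tokens = sentence.split()

print(extract_ngrams(tokens, 1))  # unigrams: ('I',), ('love',), ...
print(extract_ngrams(tokens, 2))  # bigrams: ('I', 'love'), ('love', 'reading'), ...
print(extract_ngrams(tokens, 3))  # trigrams: ('I', 'love', 'reading'), ...
```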
An N-gram language model predicts the probability of a given N-gram within any sequence of words in the language. With a good N-gram model, we can estimate p(w | h), the probability of seeing the word w given a history h of the previous n-1 words.
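As a concrete instance, a bigram model (n = 2) estimates p(w | h) by maximum likelihood from corpus counts: p(w | h) = count(h, w) / count(h). Below is a minimal sketch over a toy corpus; the corpus and variable names are illustrative assumptions, and sentence-boundary markers are omitted for brevity:

```python
from collections import Counter

# Toy corpus; in practice these counts come from a large text collection.
corpus = [
    "I love reading blogs",
    "I love data science",
    "I enjoy reading",
]

unigram_counts = Counter()
bigram_counts = Counter()
for line in corpus:
    words = line.split()
    unigram_counts.update(words)
    bigram_counts.update(zip(words, words[1:]))  # adjacent word pairs

def bigram_prob(w, h):
    """MLE estimate: p(w | h) = count(h, w) / count(h)."""
    return bigram_counts[(h, w)] / unigram_counts[h]

print(bigram_prob("love", "I"))  # 2/3: "I" occurs 3 times, followed by "love" twice
```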
Two benefits of n-gram models (and algorithms that use them) are simplicity and scalability – with larger n, a model can store more context with a well-understood space–time tradeoff, enabling small experiments to scale up efficiently.
Hidden Markov Model
Consider weather, stock prices, DNA sequences, human speech, or words in a sentence. In all these cases, the current state is influenced by one or more previous states. Moreover, we can often observe the effect but not the underlying cause, which remains hidden from the observer. A Hidden Markov Model (HMM) helps us figure out the most probable hidden state given an observation; in practice, we use a sequence of observations to estimate the sequence of hidden states. Formally, an HMM is specified by a set of hidden states, an initial state distribution, transition probabilities between hidden states, and emission probabilities that link each hidden state to the possible observations. In an HMM, the next state depends only on the current state (the Markov assumption), which makes it well suited to modelling time-series data.
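Recovering the most probable hidden-state sequence from a sequence of observations is typically done with the Viterbi algorithm. Below is a minimal sketch on a toy weather example; the states, observations, and all probabilities are illustrative assumptions, not estimated from real data:

```python
states = ["Rainy", "Sunny"]                       # hidden states
start_p = {"Rainy": 0.6, "Sunny": 0.4}            # initial distribution
trans_p = {                                        # transition probabilities
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emit_p = {                                         # emission probabilities
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def viterbi(observations):
    """Return the most probable hidden-state sequence for the observations."""
    # V[t][s] = (best probability of any path ending in state s at time t, backpointer)
    V = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        V.append({
            s: max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs], prev)
                for prev in states
            )
            for s in states
        })
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(V) - 1, 0, -1):
        last = V[t][last][1]
        path.append(last)
    return list(reversed(path))

print(viterbi(["walk", "shop", "clean"]))  # -> ['Sunny', 'Rainy', 'Rainy']
```

The dynamic program keeps, for each time step and state, only the best path ending in that state, so decoding runs in time linear in the observation length rather than exponential.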