Take the language \left\{a^n \mid n \in \mathbb{N}\right\}, consisting of n copies of the symbol a. This is a regular language, and regular languages can be recognized in O\left(n\right), linear time.
Take the language \left\{a^n b^n \mid n \in \mathbb{N}\right\}. This is no longer regular, but it is context-free, and context-free languages can be recognized in O\left(n^3\right), cubic time. However, this particular language can be recognized in linear time.
Take the language \left\{a^n b^n c^n \mid n \in \mathbb{N}\right\}. This is no longer context-free, but it belongs to a class called 2-MCFGwn (more on that later), and those languages can be recognized in O\left(n^6\right). But even this language is still trivial to recognize in linear time!
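Just to make that last claim concrete, here's what such a linear-time recognizer can look like; a minimal Python sketch (the function name and the run-counting approach are my own illustration, not anything canonical):

```python
def recognize_anbncn(s: str) -> bool:
    """Recognize {a^n b^n c^n | n in N} in one left-to-right pass."""
    i, runs = 0, []
    for ch in "abc":
        start = i
        while i < len(s) and s[i] == ch:  # consume the maximal run of ch
            i += 1
        runs.append(i - start)
    # Accept iff the whole string was consumed and all three runs have equal length
    return i == len(s) and runs[0] == runs[1] == runs[2]
```

Dropping the c-run check gives a recognizer for a^n b^n, and dropping the b-run as well handles a^n.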
What’s going on here? These are canonical examples of their respective classes! If they’re so easy to recognize, how hard can it be to recognize other languages in their class?
I’ll start with a little bit of background, but then quickly get into the meat of this post, which answers some questions of the shape “how hard can the languages of a particular grammar formalism get to parse?”.
The k-Clique problem
There’s a famously hard problem, the k-Clique problem (“does this graph contain a clique of size k?”), about which it is conjectured that the lowest achievable exponent is \omega k / 3 (or just straight up k if you exclude “algorithms like fast matrix multiplication”, but this is an informal notion). This is the k-Clique hypothesis, and from what I can gather, it’s more towards the \text{P} \neq \text{NP} end of the spectrum when it comes to hypotheses in theoretical computer science.
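For a sense of scale: the obvious brute-force algorithm tries every k-subset of vertices, of which there are roughly n^k, and fast matrix multiplication is what brings the exponent down from k to \omega k / 3. A sketch of the brute force (the adjacency-matrix representation is just my choice for the example):

```python
from itertools import combinations

def has_k_clique(adj: list[list[bool]], k: int) -> bool:
    """Brute force: test all k-subsets of vertices, roughly O(n^k) of them."""
    return any(
        all(adj[u][v] for u, v in combinations(subset, 2))  # every pair adjacent?
        for subset in combinations(range(len(adj)), k)
    )
```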
CFGs
CFGs are a very commonly used class of grammars, for which the best known general parsers run in the cubic-ish range: O\left(n^3\right) without fast matrix multiplication (more precisely O\left(n^3 / \left(\log n\right)^2\right), but the polynomial exponent is the important thing here), and O\left(n^\omega\right) with it, where \omega < 2.38 is the matrix multiplication exponent.
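That cubic figure comes from dynamic-programming recognizers in the CYK family, which fill in a table of which nonterminals derive which substrings. A compact sketch for grammars in Chomsky normal form (the data representation is my own choice here):

```python
def cyk_recognize(word, terminal_rules, binary_rules, start):
    """CYK recognition for a CNF grammar; O(n^3) for a fixed grammar.

    terminal_rules: dict terminal -> set of nonterminals, e.g. {"a": {"A"}}
    binary_rules:   list of (head, left, right) for productions A -> B C
    """
    n = len(word)
    # table[i][j] = set of nonterminals deriving word[i..j] inclusive
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][i] = set(terminal_rules.get(ch, ()))
    for length in range(2, n + 1):        # span length
        for i in range(n - length + 1):   # span start
            j = i + length - 1            # span end
            for split in range(i, j):     # split point; three nested loops give n^3
                for head, left, right in binary_rules:
                    if left in table[i][split] and right in table[split + 1][j]:
                        table[i][j].add(head)
    return n > 0 and start in table[0][n - 1]
```

For instance, with terminal_rules = {"a": {"A"}, "b": {"B"}} and binary_rules = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")] (a CNF version of S \to aSb \mid ab), cyk_recognize("aabb", terminal_rules, binary_rules, "S") comes out True.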
But those are upper bounds on the complexity, not lower. Are there CFGs that actually take that sort of cubic-ish time to parse? Turns out, if you believe the k-Clique hypothesis, then yes! This paper gives a CFG (of fixed size) and a way to encode a graph into a string (in time linear in string length) such that the language of the CFG contains that string if and only if the string describes a graph that contains a 3k-clique (that is, a clique of a size that’s a multiple of 3). According to the k-Clique hypothesis, this language cannot be recognized in time O\left(n^{\omega-\varepsilon}\right) for any \varepsilon > 0, or, in other words, in polynomial time with an exponent lower than \omega. If you won’t use things like fast matrix multiplication, the exponent is instead 3.
So there we have it, proof that if we pick CFG as our grammar formalism of choice, we are theoretically at risk of ending up with grammars that take pretty much cubic time to parse.
Mildly context-sensitive grammars
There’s just one thing… CFGs aren’t actually expressive enough to capture the entirety of human language. There are languages with structures that require a more expressive grammar, as examined in this article (PDF here). Specifically, Swiss German has a construction where you end up with a sequence of NPs in dative or accusative case, followed by a sequence of verbs that require those cases, in the same order as the NPs; these are so-called cross-serial dependencies. We can abstract this into the “copy” language \left\{ww \mid w \in \Sigma^*\right\}, which consists of two copies of the same string. This language is not context-free.
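And yet, true to the pattern from the intro, recognizing a given string as a member of the copy language is again trivially linear, even though no CFG generates the language; a sketch:

```python
def recognize_copy(s: str) -> bool:
    """Recognize the copy language {ww}: split in half and compare, O(n)."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]
```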
The temptation is to model this with the class of context-sensitive grammars, but that is far too expressive and leads to huge computational complexity. So we want a weaker formalism, and that’s where mildly context-sensitive grammars come in. One of the more well-studied of these is tree-adjoining grammars (TAG), which can handle the language \left\{ a^n b^n c^n d^n \mid n \in \mathbb{N}\right\} and the copy language \left\{ww \mid w \in \Sigma^*\right\}, but not \left\{ a^n b^n c^n d^n e^n \mid n \in \mathbb{N}\right\} or the double copy language \left\{www \mid w \in \Sigma^*\right\}. Another grammar formalism that isn’t quite as powerful as general context-sensitive grammars is minimalist grammars, formalizations of the sort of syntactic theories used in the minimalist program, the ones with movement you may be familiar with.

There’s also multiple context-free grammars (MCFG), for which there is a good introduction here (PDF). An MCFG with rank r and dimension d is called a d-MCFG(r), or a d-MCFGwn(r) if it’s well-nested. For well-nested MCFG and d, r \geq 2, d-MCFGwn(r) is equivalent to d-MCFGwn(2), so we can forget about the rank and just write d-MCFGwn. MCFG are quite a general formalism: general MCFG are equivalent to minimalist grammars, an extension called PMCFG (parallel MCFG) is equivalent to minimalist grammars with copying, and in fact 2-MCFGwn, which is in some sense the least complex kind of MCFG, is equivalent to TAG! A nice property of MCFG is that you can parse any d-MCFG(r) grammar in time O\left(n^{d\left(r+1\right)}\right).
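For the smallest interesting case, d = 2 and r = 2, this bound works out to O\left(n^{2\left(2+1\right)}\right) = O\left(n^6\right), the figure quoted for 2-MCFGwn back in the introduction.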
So how hard can parsing 2-MCFGwn get, then? From the bound above, we can parse 2-MCFGwn (and thus TAG) in O\left(n^6\right) time, and there are in fact TAG parsers with that performance, and even ones that run in O\left(n^{2\omega}\right) time. But as before, that’s just an upper bound; is there a lower bound? Using a technique very similar to the CFG paper’s, this paper constructs a TAG for a language that contains a string encoding a graph iff that graph has a 6k-clique, and uses it to prove that if you believe the k-Clique hypothesis, TAG (and by extension 2-MCFGwn) cannot be parsed in time with a polynomial exponent lower than 2\omega, or 6 if you won’t use things like fast matrix multiplication.
Thus, if you use TAG, MCFG, or minimalist grammars, you run the theoretical risk of needing pretty much O\left(n^6\right) time to parse your language!
Conclusion
Based on a pretty solid conjecture, the upper bounds of our parsing algorithms for context-free grammars, and for some of the weakest mildly context-sensitive formalisms, are essentially also the lower bounds, at least when it comes to the polynomial degree. Luckily, for most practical grammars the situation isn’t nearly as bad; parsing often runs in close to linear time. But we know for sure that it is possible to trigger this higher complexity as long as you work within the unrestricted versions of these formalisms, which is a bit scary.
Why mention “unrestricted”? Because, at least for CFGs, there are huge families of grammars that have been proven to be parseable in linear time. This is where our LR(k) and LALR parsers and so on come from. So that’s an option if you want to avoid the spooky cubic monster.
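As a tiny illustration of the flavor: the grammar S \to aSb \mid \varepsilon for \left\{a^n b^n \mid n \in \mathbb{N}\right\} is LR(1), so a deterministic parser recognizes it in a single left-to-right pass. A hand-rolled sketch of the same single-pass idea, with a counter standing in for the parser’s stack:

```python
def recognize_anbn(s: str) -> bool:
    """One deterministic pass over {a^n b^n}; the counter mimics an LR stack."""
    depth, seen_b = 0, False
    for ch in s:
        if ch == "a":
            if seen_b:       # an 'a' after a 'b' can never lead to acceptance
                return False
            depth += 1       # "shift": remember one more unmatched 'a'
        elif ch == "b":
            seen_b = True
            depth -= 1       # "reduce": match this 'b' against a pushed 'a'
            if depth < 0:
                return False
        else:
            return False
    return depth == 0
```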
What about the even more expressive formalisms? Higher-dimensional MCFGwn grammars like 3-MCFGwn, or not-necessarily-well-nested MCFG and minimalist grammars, or even PMCFG and minimalist grammars with copying? What sort of lower bounds do we have for those? To my knowledge, that’s an open question! Exciting, but also a little scary. I wouldn’t be surprised if this k-Clique technique could be extended to push any minimalist grammar or MCFG to its limits.
EDIT: changed the wording a little to reflect that the languages are much bigger than only the proofs-of-clique, see followup