Interaction without the main effects

I’m a bit stuck on a question that has come up among practitioners, and we have no clever statistician at hand to guide us through the theoretical reasoning. Many people I’ve talked to state that it is fundamentally wrong to consider a model that includes an interaction without the corresponding main effects. I had thought this holds mainly when the modeling is based on hypothesis testing or likelihood ratios (since, to my understanding, these methods use the variance components of the main effects in estimating the coefficient of the interaction term). Am I way off here, or is it statistically correct, in principle, to treat an interaction without main effects as a valid parameterization in information-theoretic model selection (via maximum likelihood estimation)?
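To fix notation, here is how I would write the two parameterizations in question. This is only a sketch under my own assumptions: a and b are continuous covariates, the response y is modelled linearly with a Gaussian error, and a*b denotes the product term alone (not the software shorthand that expands to both main effects plus the interaction). The symbols \beta_0 ... \beta_3 and \varepsilon are mine.

```latex
% Model with main effects and interaction, written (a + b + a*b) in this post:
y = \beta_0 + \beta_1 a + \beta_2 b + \beta_3\, ab + \varepsilon

% Interaction without main effects, written (a*b) in this post:
y = \beta_0 + \beta_3\, ab + \varepsilon
```

The second model is the first with \beta_1 = \beta_2 = 0, so it is nested in the first and has two fewer estimated parameters.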
Furthermore,
1) In case it is statistically correct and also biologically reasonable to hypothesize (a*b), when is it really necessary? Comparing (a + b + a*b) and (a + b) already provides information on the relative importance of the interaction. If the main effects are a priori thought to be possibly meaningless, whereas the interaction would make biological sense, should one also include (a*b)?
(This model would have a smaller K, the number of estimated parameters. Would the difference in the structure of the model as a whole also matter?)
2) Conversely, if b is a priori suspected not to play a biologically important role independently of a, what is the reason to exclude the model (a + a*b)?
3) Does it matter if other parameters are also involved? For example, comparing models:
(a + b + c)
(a + b + c + a*b)
(c + a*b)
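To make the comparison in question 3 concrete, here is a minimal sketch of how the three models could be ranked by AIC. Everything in it is an assumption for illustration: the data are simulated (not real), the covariates are treated as continuous, the models are fitted by ordinary least squares, and the helper name `aic_ols` is hypothetical. AIC is computed in its least-squares form, n*ln(RSS/n) + 2K, with K counting the intercept, the slopes and the residual variance.

```python
import numpy as np

def aic_ols(y, columns):
    """Fit y ~ intercept + columns by least squares; return (AIC, K, RSS)."""
    n = len(y)
    X = np.column_stack([np.ones(n)] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    k = X.shape[1] + 1  # regression coefficients plus the residual variance
    aic = n * np.log(rss / n) + 2 * k
    return aic, k, rss

# Simulated data, purely illustrative: the response depends on c and on the
# product a*b, with no separate main effects of a or b.
rng = np.random.default_rng(0)
n = 200
a, b, c = rng.normal(size=(3, n))
y = 1.0 + 0.5 * c + 0.8 * a * b + rng.normal(size=n)

# Here a*b means the product term only, as in the model list above.
models = {
    "a + b + c": [a, b, c],
    "a + b + c + a*b": [a, b, c, a * b],
    "c + a*b": [c, a * b],
}
results = {name: aic_ols(y, cols) for name, cols in models.items()}
for name, (aic, k, rss) in results.items():
    print(f"{name:18s} K={k} RSS={rss:8.2f} AIC={aic:8.2f}")
```

Because (c + a*b) is nested in (a + b + c + a*b), its residual sum of squares can never be smaller than that of the larger model; it can only win on AIC through its smaller K. Note also that the product term changes if a or b is shifted by a constant, so the fit of a model without main effects depends on where the covariates' zero points lie.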
I would appreciate it if anybody could explain the general reasoning for why a model including an interaction without the main effects is considered a poor one, or discuss the use of this kind of parameterization.
Best regards, Miina