Methodologies and Architectures for AI Inference Hardware: From Foundational Networks to Large Language Models