Fast Facts on AI Language Models
Author: Rakesh Tembhurne (@tembhurnerakesh)
AI has been around long enough now that we can start picking the right tools for the right jobs. I have tried multiple tools and services, and it's time to gather those learnings to make informed choices for future projects.
Which AI Model To Choose
ChatGPT
ChatGPT is the de facto option that comes to mind. I specifically come back to it when I need an 'intelligent answer'. Any LLM can produce text output, but ChatGPT builds its answers in a way I don't think any other AI service has matched yet.
I have also been using the OpenAI APIs to build tools and services, and I rely heavily on their text-to-speech service.
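For reference, a minimal sketch of calling the text-to-speech endpoint with the official openai Node SDK (tts-1 and the alloy voice are just the commonly documented defaults, not necessarily the best choice for every project):

```ts
import fs from "node:fs/promises";
import OpenAI from "openai";

// Reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

async function speak(text: string) {
  // tts-1 and "alloy" are documented defaults; swap in whatever fits.
  const response = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: text,
  });

  // The SDK returns a fetch-style Response; buffer it and write to disk.
  const audio = Buffer.from(await response.arrayBuffer());
  await fs.writeFile("speech.mp3", audio);
}

speak("Fast facts on AI language models.").catch(console.error);
```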
Gemini
Gemini is my primary choice, but when I am not satisfied with its answer, I switch back to ChatGPT. The main reason I opt for Gemini is the way it formats its output. Google has clearly worked hard on showing how an article (or any content) should be formatted for humans, and I trust them more here because they have been telling the rest of the web the same thing for years through their Search Engine Optimization guidelines.
Llama
Llama 3 is my first option when it comes to choosing an open-source, self-hosted AI model. Meta recently released version 3, which is a substantial improvement over its predecessors.
Other Models
Although I tried various models like Mistral, Mixtral, Phi-3, and so on, I didn't find them more useful than any of the above three. Why switch if a model does nothing more than copy what you already have? Unless a model serves a very specific purpose and is great at it, there is no reason to choose it.
LLM Tools and Services
Ollama
Out of the multiple options for hosting LLMs locally, I prefer Ollama: it is very straightforward, works from the CLI, and exposes an API for developing apps that run locally and offline.
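As a minimal sketch of that offline workflow (assuming Ollama is running and the model has already been fetched with `ollama pull llama3`; the local API listens on port 11434 by default):

```ts
// Ollama serves a local REST API on http://localhost:11434 by default.
async function askLlama(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream: false returns one JSON object instead of a stream of chunks.
    body: JSON.stringify({ model: "llama3", prompt, stream: false }),
  });
  const data = await res.json();
  return data.response; // the generated text
}

askLlama("Summarize the Llama 3 release in one sentence.").then(console.log);
```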
Groq
I have been following Groq ever since they made headlines with their new chip.
Groq provides a very fast API. They have built a different kind of chip, the LPU (Language Processing Unit), which lets them run the same LLMs much faster than graphics cards and CPUs can. They host open-source LLMs and expose them through an API service.
For now, they haven't started charging, so the API is free to use. It's a good time to take advantage of their service.
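Their API is OpenAI-compatible, so a quick sketch is just the openai SDK pointed at Groq's base URL (llama3-8b-8192 is one of the hosted model ids; check Groq's current model list):

```ts
import OpenAI from "openai";

// Groq exposes an OpenAI-compatible endpoint, so the openai SDK works
// as-is once pointed at their base URL.
const groq = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

async function main() {
  const completion = await groq.chat.completions.create({
    model: "llama3-8b-8192", // one of the open-source models Groq hosts
    messages: [{ role: "user", content: "In one line: why does inference speed matter?" }],
  });
  console.log(completion.choices[0].message.content);
}

main();
```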
Vercel AI SDK
Using Vercel for hosting apps is common nowadays. Vercel also provides the Vercel AI SDK, a library for building AI-powered streaming text and chat UIs. It supports modern frameworks like Next.js, Svelte, and others.
To be honest, I tried both Vercel AI's Chat API and its Chat UI. I chose not to use the Chat UI, since it is the typical chat interface you see everywhere; it is good for getting started, but I haven't found a good use case for a chat interface. The Chat API, however, was very useful: it saved me a lot of time on smaller streaming details that otherwise take a lot of development time.
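For the curious, a minimal sketch of the route-handler side in Next.js, using the streaming helpers the `ai` package shipped with (OpenAIStream and StreamingTextResponse; newer SDK versions expose different helpers, so check the current docs):

```ts
// app/api/chat/route.ts — pairs with the SDK's useChat hook on the client.
import OpenAI from "openai";
import { OpenAIStream, StreamingTextResponse } from "ai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Ask OpenAI for a streaming completion...
  const response = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    stream: true,
    messages,
  });

  // ...and hand the token stream straight back to the browser.
  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}
```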
Azure OpenAI
Here, we get OpenAI's tools as a managed service, so you can build your own copilot and generative AI applications. It claims to connect to your data, call functions, and improve workflows with language and image models. I am yet to try this service, so more on it later.
Hugging Face
Hugging Face is a platform where the machine-learning community collaborates on models, datasets, and applications. For me it has primarily been the place to search for models built by less mainstream authors for less mainstream use cases. People with a deep interest in learning and creativity will be found on Hugging Face.
Together AI
Together AI is a cloud platform for building and running generative AI. I liked the pricing of this service, which is very straightforward: on the order of $0.20 per million tokens. I have not yet used this service, so more on it later.
LLM Training Methods
- Odds Ratio Preference Optimization (ORPO): A team of researchers from South Korea has presented a new training method for Large Language Models, named Odds Ratio Preference Optimization (ORPO), that is more computationally efficient and, importantly, seems to produce better-performing models (sketched below).
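As best I understand the paper, the gain comes from folding preference optimization directly into the supervised fine-tuning loss via an odds ratio, removing the need for a separate reference model. A sketch of the objective as I read it, with y_w the chosen and y_l the rejected response:

```latex
% ORPO: plain SFT loss plus an odds-ratio preference penalty, weighted by lambda.
% odds_theta(y | x) = P_theta(y | x) / (1 - P_theta(y | x))
\mathcal{L}_{\mathrm{ORPO}}
  = \mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[ \mathcal{L}_{\mathrm{SFT}} + \lambda \cdot \mathcal{L}_{\mathrm{OR}} \right],
\qquad
\mathcal{L}_{\mathrm{OR}}
  = -\log \sigma\!\left( \log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)} \right)
```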