These notes are for using AI services and tools. For working with APIs, refer to this note. For tools and notes on Vibe Coding, refer to this note.
- CLIProxyAPI: wraps Gemini CLI, Antigravity, ChatGPT Codex, Claude Code, Qwen Code, and iFlow as an OpenAI/Gemini/Claude/Codex-compatible API service, letting you use the free Gemini 2.5 Pro, GPT-5, Claude, and Qwen models through an API.
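As a rough sketch of what that looks like in practice, a running CLIProxyAPI instance can be called like any OpenAI-compatible endpoint; the port and model name below are assumptions and depend entirely on your own configuration:

```bash
# Assumed base URL; change it to wherever your CLIProxyAPI instance is listening.
PROXY_URL="http://localhost:8317"

# Standard OpenAI-style chat completion request, routed through the proxy
# to whichever backend (Gemini CLI, Codex, Claude Code, ...) serves this model name.
curl "$PROXY_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
```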
This is based on my personal experience with the current versions. The list may change significantly in the future.
TL;DR: Coding with Claude Code. Everything else with Gemini. That's it!
- Summarize YouTube videos: Notebook LM or ask Grok with a URL.
- Check references/sources: Perplexity, then Grok / Gemini / ChatGPT.
- Record and summarize live meetings: ChatGPT Pro.
- All-in-one chatbot models: Monica (affordable option).
- Work with personal files/sources: Use Project or Spaces features in AI services and upload your resources.
- Voice Mode (for English speaking practice): Gemini, ChatGPT (has memory), Grok (for creative conversations).
- AI IDE: (Updated: anything with the Claude Code extension.) Cursor, then VSCode with GitHub Copilot. Both use Claude models.
- Image editing/generation: Gemini Banana.
- Generate a new photo based on a current photo: Gemini Banana (with a good prompt).
- Replace clothes: Grok or Gemini Banana.
- Video generation (photo to video): Grok Imagine.
- Check the model list on the home page.
- To download a model (this can't be done from the desktop app): `ollama pull <model_name>`
- Models I use: qwen3:8b (for tasks that need a quick response), qwen3-coder:30b (to use with Claude Code on my Mac M4), gpt-oss:20b (daily chat).
- To run with the API endpoints, check this official doc (a quick sketch follows below).
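As a minimal sketch (assuming Ollama's default port 11434 and one of the models listed above), downloading a model and calling the chat endpoint looks like this:

```bash
# Download the model once; it is cached locally afterwards.
ollama pull qwen3:8b

# Chat through the REST API; the model is loaded automatically on the first request.
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3:8b",
  "messages": [{ "role": "user", "content": "Introduce yourself in one sentence." }],
  "stream": false
}'
```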
If you want to use Claude Code with local models or enable web access more easily, use Ollama instead.
- You need to enable the server before using the endpoints (⚠️ make sure to enable CORS).
- No need to load the model before sending requests to the endpoints; it will be loaded automatically.
- Example curl request:

```bash
# "model" can be any model loaded in LM Studio, e.g. google/gemma-3-27b or openai/gpt-oss-20b.
curl http://127.0.0.1:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-27b",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": true
  }'
```

- In your IDE (VSCode or Cursor), install the Continue extension.
- In LM Studio, navigate to the Developer tab, select your downloaded model → Settings → enable "Serve on Local Network" → enable the server (you can sanity-check that the server is reachable with the curl snippet after this list).
- In your IDE, select the "Continue" tab in the left sidebar → choose "Or, configure your own model" → "Click here to view more providers" (or select the Ollama icon tab if you're using Ollama) → in the provider list, select LM Studio → set Model to "Autodetect" → Connect → a config file will open at ~/.continue/config.yaml; keep the default settings and save.
- That's it!
- As another option, you can use Granite.code (from IBM).
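As a quick sketch (assuming LM Studio's default port 1234 and that the server from the steps above is enabled), you can verify the endpoint is reachable before wiring up Continue:

```bash
# Lists the models currently served by the local LM Studio server.
# If this fails, double-check that the server is enabled in the Developer tab.
curl http://127.0.0.1:1234/v1/models
```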
I'm using Claude Code; if you use another coding CLI, modify the code accordingly. Add the code below to your .bashrc or .zshrc, then run `source ~/.zshrc`:

```zsh
claude_execute() {
  emulate -L zsh
  setopt NO_GLOB
  local query="$*"
  local prompt="You are a command line expert. The user wants to run a command but they don't know how. Here is what they asked: ${query}. Return ONLY the exact shell command needed. Do not prepend with an explanation, no markdown, no code blocks - just return the raw command you think will solve their query."
  local cmd
  # Ask Claude Code for the command; strip control characters and surrounding whitespace.
  cmd=$(claude --dangerously-skip-permissions --disallowedTools "Bash(*)" --model default -p "$prompt" --output-format text | tr -d '\000-\037' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
  if [[ -z "$cmd" ]]; then
    echo "claude_execute: No command found"
    return 1
  fi
  # Show the suggested command (in cyan), then run it.
  echo -e "$ \033[0;36m$cmd\033[0m"
  eval "$cmd"
}
alias ask="noglob claude_execute"
```

```zsh
# Usage
ask "List all conda env in this computer"
```