r/LocalLLaMA

members
online


Hello! Bonjour! Ciao! Now you're speaking their language thanks to Interpreter with Galaxy AI. Talk the talk with real-time translations you both can see when you pre-order the new Galaxy Z Flip6 and get double the storage on us.
media poster


New Model CodeGeeX4-ALL-9B has been released.
Discussion

We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on GLM-4-9B, significantly enhancing its code generation capabilities. A single CodeGeeX4-ALL-9B model supports comprehensive functions such as code completion and generation, a code interpreter, web search, function calling, and repository-level code Q&A, covering a wide range of software development scenarios. CodeGeeX4-ALL-9B has achieved highly competitive results on public benchmarks such as BigCodeBench and NaturalCodeBench. It is currently the most powerful code generation model under 10B parameters, even surpassing much larger general-purpose models, and strikes the best balance between inference speed and model performance.

We have been developing a local AI to create a decentralized developer kit (DDK). We've alternated between CodeQwen, for its speed and larger context window, and DeepSeek, for its higher accuracy but very limited context window; context size is crucial for code generation within existing code.

But wow! We just tested CodeGeeX4-ALL-9B (quantized to 3-bit). It accepts more than 1500 tokens of context, evaluates prompts at 172.99 tokens/s, and generates more than 25 tokens/s on my M1 machine while consuming 4.5 GB of RAM.

The future of all personal AI is local, not cloud-based. We all have the machines to make that possible; we just need to look around!

https://huggingface.co/THUDM/codegeex4-all-9b/tree/main

GGUF: https://huggingface.co/bartowski/codegeex4-all-9b-GGUF

If you are facing issues with llama.cpp, just update it to the latest release.
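If you'd rather script against the GGUF than use the llama.cpp CLI, here's a minimal sketch using the llama-cpp-python bindings (`pip install llama-cpp-python`). The quant filename and the GLM-4-style chat template are assumptions on my part — check the model card in the repos above for the exact prompt format before relying on this.

```python
def build_prompt(query: str) -> str:
    """Wrap a user query in a GLM-4-style chat template (assumed format;
    verify against the CodeGeeX4 model card)."""
    return "[gMASK]<sop><|user|>\n" + query + "<|assistant|>\n"

def generate(query: str,
             model_path: str = "codegeex4-all-9b-Q4_K_M.gguf") -> str:
    """Run one completion against a local GGUF file.

    The filename above is an assumed quant from the bartowski repo;
    pick whichever quant level fits your RAM.
    """
    from llama_cpp import Llama  # deferred: only needed at inference time

    llm = Llama(
        model_path=model_path,
        n_ctx=8192,        # context window to allocate
        n_gpu_layers=-1,   # offload all layers (uses Metal on Apple Silicon)
    )
    out = llm(
        build_prompt(query),
        max_tokens=256,
        stop=["<|user|>", "<|endoftext|>"],
    )
    return out["choices"][0]["text"]
```

Usage: `print(generate("Write a Python function that reverses a string."))`. On an M1, `n_gpu_layers=-1` keeps everything on the GPU via Metal, which is where the tokens/s figures above come from.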