
marketing: GTC 2025 #80

Open
eckartal opened this issue Sep 19, 2024 · 1 comment

Comments

@eckartal
Contributor

Overall

GTC 2025 will be held in person on March 17-20, 2025 in San Jose. The NVIDIA team wants us to share our work there.

By then, we hope to have integrated Jade into Jan, which, powered by Cortex, should be flexible enough to run on any hardware. It's like gathering puzzle pieces to solve the personal AI assistant problem: LLMs, hardware, UI, a small OS, an assistant, and a voice assistant.

Goals

[ ] Find a topic to share
[ ] Prepare a submission that fits NVIDIA's requirements, including title, description, and key findings
[ ] Submit!

Ideas

We should position ourselves as a thought leader in AI development by sharing our findings, providing timelines and a full-cycle view from solved problems to future challenges.

  • Covers entire AI assistant development lifecycle
  • Highlights interdependencies between hardware, software, and UI - covering Jan, Cortex and Jade
  • Demonstrates holistic engineering thinking in tech development

This talk should recap our full-cycle development journey.

Flow

I think we should have two sections, one on problems we've solved and one on tomorrow's problems, explaining what we've done to build a full-stack product company: starting from LLM problems and moving through hardware to voice assistants, showcasing Jan as a UI, Cortex as an engine, and Jade as a voice assistant, and finally combining all of them into a real-life Jarvis that communicates well, understands human speech, and talks back!

Section 1: Building the Foundation

  • LLM advancement (models that communicate well)
  • Hardware evolution (focusing on NVIDIA)
  • Software optimization (libraries and accelerators that make AI easy to run, and of course Cortex)
  • UI (a simple interface to chat with AI, focusing on our learnings from Jan)
  • Voice assistant integration, explaining our research effort for Jan
  • System integration: how Jan, Cortex, and Jade fit together

Section 2: Tomorrow's Problems

  • Improving problem-solving skills, touching on CoT, human-like reasoning, and cognitive advancements
  • Integration issues and how to make AI small: making dumb devices smart, sharing our vision for ubiquitous AI assistance
  • Physical Embodiment... robots!

Key Milestones

  • Complete Jade-Jan integration
  • Improve Cortex compatibility for cross-hardware flexibility
  • Prepare presentation materials (puzzle-like problem-solving)
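The cross-hardware flexibility milestone could be illustrated in the talk with a simple tiering policy. The sketch below is hypothetical; all names and thresholds are illustrative assumptions, not the actual Cortex API:

```python
# Hypothetical sketch of how a local engine like Cortex might pick a
# runtime configuration per hardware tier. Names and thresholds are
# illustrative, not the real Cortex API.
from dataclasses import dataclass

@dataclass
class HardwareProfile:
    vram_gb: float  # GPU memory available, 0 if CPU-only
    ram_gb: float   # system memory

def select_config(hw: HardwareProfile) -> dict:
    """Map a machine's resources to a model size, backend, and quantization."""
    if hw.vram_gb >= 24:
        return {"backend": "tensorrt-llm", "quant": "fp16", "model": "8B"}
    if hw.vram_gb >= 8:
        return {"backend": "tensorrt-llm", "quant": "int4", "model": "8B"}
    if hw.ram_gb >= 16:
        return {"backend": "cpu-ggml", "quant": "q4", "model": "3B"}
    return {"backend": "cpu-ggml", "quant": "q4", "model": "1B"}

laptop = HardwareProfile(vram_gb=8, ram_gb=16)
print(select_config(laptop))  # a mid-range RTX laptop lands on int4
```

A slide built around a decision table like this would make the "runs on any hardware" claim concrete without diving into engine internals.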

Talk Submission

Title

Building "Jarvis" through puzzle-like problem solving

Description

This talk covers the development of AI assistants capable of solving complex problems like puzzles. We'll discuss the technologies that make these assistants possible, including advanced language models, specialized hardware, and intuitive interfaces. The presentation will highlight current capabilities and future challenges in AI assistant development, with a focus on NVIDIA's contributions to the field.

This talk covers the full lifecycle of AI assistant development, structured in two parts. First, we cover solved problems: LLM advancements, hardware optimization, software integration, UI, and voice recognition. Second, we address future challenges: enhancing AI problem-solving, achieving device integration, and exploring physical embodiment. We'll share our learnings from building a real-life "Jarvis"-like personal assistant that runs 100% offline on everything from personal laptops to datacenter-level machines, demonstrating how these elements combine to create AI assistants capable of natural communication and speech understanding.

Key Takeaways

  • Understand a full-cycle development approach in creating a Jarvis-like AI assistant, from LLM advancements to system integration
  • Learn how NVIDIA's hardware innovations, coupled with our Cortex engine, enable flexible and powerful AI processing
  • Learn about the evolution of AI interfaces, from text to multimodal interactions, and their integration into everyday systems
  • Discover the synergy between Jan (UI), Cortex (engine), and Jade (voice), creating a seamless AI assistant experience
  • Explore our vision for future AI development, including advanced problem-solving, ubiquitous integration, and physical embodiment

Resources

@eckartal
Contributor Author

eckartal commented Sep 20, 2024

We've changed the topic.

Title

Running TensorRT-LLM on 10,000 RTX Machines: What We've Learned

Description

In this talk, we'll detail our experience with TensorRT-LLM through the lens of our desktop application, which has enabled over 10,000 RTX machine users to run AI models locally. As specialists in desktop inference, we've encountered significant edge cases when implementing AI at scale on consumer-grade hardware. We'll present our findings on the technical challenges, performance metrics, and key insights gained from this large-scale deployment, offering a data-driven perspective on AI implementation for individual users and teams.

Key Takeaways

  • Overview of our desktop application leveraging TensorRT-LLM for local AI model execution
  • Performance analysis on consumer-grade RTX GPUs
  • Quantitative data and benchmarks from our 10,000-machine deployment
  • Evidence-based best practices for the hobbyist and consumer market
  • Insights and best practices for implementing AI at scale, from consumer- to datacenter-grade hardware
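The deployment-benchmarks bullet could be backed by a small aggregation script in the talk. A minimal sketch, using synthetic placeholder numbers rather than real telemetry from our deployment:

```python
# Illustrative only: aggregating per-machine throughput reports into a
# per-GPU summary like the one the talk could present. The numbers are
# synthetic placeholders, not real deployment telemetry.
import statistics

# (gpu_model, tokens_per_second) samples, as a fleet might report them
reports = [
    ("RTX 3060", 28.5), ("RTX 3060", 31.2), ("RTX 3060", 29.9),
    ("RTX 4070", 55.0), ("RTX 4070", 58.3), ("RTX 4070", 56.7),
    ("RTX 4090", 112.4), ("RTX 4090", 108.9), ("RTX 4090", 110.0),
]

def summarize(samples):
    """Group throughput by GPU model and report the median per group."""
    by_gpu = {}
    for gpu, tps in samples:
        by_gpu.setdefault(gpu, []).append(tps)
    return {gpu: statistics.median(vals) for gpu, vals in by_gpu.items()}

print(summarize(reports))
```

Medians per GPU model keep the summary robust to the outlier machines a 10,000-node fleet inevitably produces.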

Relevance to NVIDIA

This presentation aligns with NVIDIA's goal of expanding TensorRT-LLM adoption. By providing concrete data on its implementation at scale in the consumer market, we demonstrate the tool's applicability beyond enterprise use cases. Our technical insights could inform NVIDIA's development roadmap and potentially accelerate adoption among individual developers and teams.

Speaker Bio

Daniel Ong is the CEO of Homebrew Research, an AI R&D studio working on local AI, small language models, and multi-modality. He started off his career as an engineer at Palantir and Pivotal Labs. He studied Computer Science at Stanford '12. Previously, Daniel was the CTO of Care, Dana Cita (YC '18).
