Search results
All capabilities
Found 806 relevant results / Frontend experience
Research & Learning / Retrieval & Organization
openrlhf-training
High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2×…
Research & Learning / Retrieval & Organization
dspy
Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's…
Research & Learning / Retrieval & Organization
pyvene-interventions
Provides guidance for performing causal interventions on PyTorch models using pyvene's declarative intervention framework. Use when conducting causal tracing,…
Research & Learning / Retrieval & Organization
gguf-quantization
GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer hardware, Apple Silicon, or when needing flexible…
Research & Learning / Retrieval & Organization
adk-docs
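Guide for creating, reviewing, updating, and searching ADK documentation. Use when the user asks about writing, maintaining, or auditing ADK bot documentation.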
Research & Learning / Retrieval & Organization
sparse-autoencoder-training
Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use…
Research & Learning / Retrieval & Organization
guidance
Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance…
Research & Learning / Retrieval & Organization
AI Research Reproduction
ai-research-reproduction
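README-first master orchestrator for reproducing AI repositories. Use when the user needs an end-to-end, minimal, trustworthy reproduction workflow that reads the repository…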
Research & Learning / Retrieval & Organization
pytorch-lightning
High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from…
Research & Learning / Retrieval & Organization
evaluating-llms-harness
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting…
Research & Learning / Retrieval & Organization
constitutional-ai
Anthropic's method for training harmless AI through self-improvement. Two-phase approach - supervised learning with self-critique/revision, then RLAIF (RL from…
Research & Learning / Retrieval & Organization
simpo-training
Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model…
Research & Learning / Retrieval & Organization
sentencepiece
Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory),…
Research & Learning / Retrieval & Organization
distributed-llm-pretraining-torchtitan
Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or…
Research & Learning / Retrieval & Organization
slime-rl-training
Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation…
Research & Learning / Retrieval & Organization
paper-self-review
This skill should be used when the user asks to "review paper quality", "check paper completeness", "validate paper structure", "self-review before…
Research & Learning / Retrieval & Organization
audiocraft-audio-generation
PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text…
Research & Learning / Retrieval & Organization
fine-tuning-with-trl
Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward…
Research & Learning / Retrieval & Organization
skypilot-multi-cloud-orchestration
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage…
Research & Learning / Retrieval & Organization
nanogpt
Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy.…