| | bedrock_tracing_and_evals_tutorial.ipynb | 17.2 KB |
| | build_benchmark_dataset_and_custom_evaluator.ipynb | 20.8 KB |
| | CoT_explanations_simple_vs_complex_evals.ipynb | 45.1 KB |
| | evals_quickstart.ipynb | 11.1 KB |
| | evaluate_agent_parameter_extraction_classifications.ipynb | 951.3 KB |
| | evaluate_agent_tool_calling_classifications.ipynb | 893.9 KB |
| | evaluate_agent_tool_selection_classifications.ipynb | 892.3 KB |
| | evaluate_agent.ipynb | 89.9 KB |
| | evaluate_code_functionality_classifications.ipynb | 203.0 KB |
| | evaluate_code_readability_classifications.ipynb | 166.8 KB |
| | evaluate_hallucination_classifications.ipynb | 141.5 KB |
| | evaluate_human_vs_ai_classifications.ipynb | 159.6 KB |
| | evaluate_QA_classifications.ipynb | 123.2 KB |
| | evaluate_rag_haystack.ipynb | 32.7 KB |
| | evaluate_rag.ipynb | 44.5 KB |
| | evaluate_reference_link_correctness_classifications.ipynb | 1.2 MB |
| | evaluate_relevance_classifications.ipynb | 181.7 KB |
| | evaluate_summarization_classifications.ipynb | 187.6 KB |
| | evaluate_toxicity_classifications.ipynb | 112.7 KB |
| | evaluate_user_frustration_classifications.ipynb | 1021.8 KB |
| | evaluations_with_error_handling.ipynb | 55.1 KB |
| | google_adk_financial_advisor.ipynb | 31.0 KB |
| | multilingual_text2cypher_evals.ipynb | 25.2 KB |
| | openai_agents_cookbook.ipynb | 28.3 KB |
| | optimizing_llm_as_a_judge_prompts.ipynb | 28.9 KB |
| | pydantic-evals.ipynb | 19.3 KB |
| | session_level_evals.ipynb | 31.4 KB |
| | trace_level_evals.ipynb | 23.2 KB |