| | evals_quickstart.ipynb | 8.2 KB |
| | evals_sql_correctness_eval_with_custom_agent.ipynb | 177.8 KB |
| | evaluate_code_functionality_classifications.ipynb | 202.9 KB |
| | evaluate_code_readability_classifications.ipynb | 166.8 KB |
| | evaluate_hallucination_classifications.ipynb | 141.5 KB |
| | evaluate_human_vs_ai_classifications.ipynb | 159.6 KB |
| | evaluate_QA_classifications.ipynb | 123.2 KB |
| | evaluate_rag_haystack.ipynb | 22.0 KB |
| | evaluate_rag.ipynb | 137.4 KB |
| | evaluate_reference_link_correctness_classifications.ipynb | 1.2 MB |
| | evaluate_relevance_classifications.ipynb | 181.7 KB |
| | evaluate_summarization_classifications.ipynb | 187.6 KB |
| | evaluate_tool_calling.ipynb | 72.2 KB |
| | evaluate_toxicity_classifications.ipynb | 112.7 KB |
| | evaluate_user_frustration_classifications.ipynb | 1021.8 KB |
| | evaluations_with_error_handling.ipynb | 55.1 KB |
| | local_llm_evals.ipynb | 19.0 KB |
| | local_llm.ipynb | 5.0 KB |