👋 Welcome to RefineBench — a comprehensive evaluation library for testing refinement capabilities of language models across multiple settings and domains. To reproduce the full results reported in ...