Hi! I wanted to share LangCheck, an open source toolkit to evaluate LLM applications (GitHub, Quickstart).

It already supports English and Japanese text, with more languages coming soon – contributions welcome!

Core functionality (there's a quick sketch after this list):

  • langcheck.metrics – metrics to evaluate quality & structure of LLM-generated text
  • langcheck.plot – interactive visualizations of text quality
  • langcheck.augment – text augmentations to perturb prompts, references, etc. (coming soon)
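To give a feel for the API, here's a minimal sketch based on the Quickstart. It assumes metric functions like langcheck.metrics.fluency() and langcheck.metrics.toxicity(), which take a list of generated texts and return metric values you can inspect or threshold (the example strings are hypothetical):

```python
import langcheck

# A few LLM outputs to evaluate (hypothetical examples)
generated_outputs = [
    'Black cat the',
    'The black cat is sitting',
    'The big black cat is sitting on the fence',
]

# langcheck.metrics – score each output (here, fluency)
fluency_values = langcheck.metrics.fluency(generated_outputs)
print(fluency_values)

# Metric values support comparisons, which is handy in unit tests
assert langcheck.metrics.toxicity(generated_outputs) < 0.1
```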

Super open to feedback & curious how other people think about evaluation for LLM apps.

kennysong@alien.top:

If you're open to using an open source library, you can use LangCheck to monitor and visualize text quality metrics in production.

For example, you can compute & plot the toxicity of user prompts and LLM responses from your logs. (A very simple example here.)
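As a rough sketch of what that could look like (assuming you've already exported prompts and responses from your logs as lists of strings, and using the .scatter() method that LangCheck's metric values provide for interactive plots):

```python
import langcheck

# Prompts and LLM responses pulled from production logs
# (hypothetical data – substitute your own log export)
prompts = ['How do I reset my password?', 'Write me a poem about cats']
responses = ['You can reset it under Settings > Account.',
             'Cats pad softly through the night...']

# Score toxicity on both sides of the conversation
prompt_toxicity = langcheck.metrics.toxicity(prompts)
response_toxicity = langcheck.metrics.toxicity(responses)

# Interactive scatter plot of the response scores
response_toxicity.scatter()
```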

(Disclaimer: I'm one of the contributors to LangCheck.)