kennysong:

Hi! I wanted to share LangCheck, an open-source toolkit for evaluating LLM applications (GitHub, Quickstart).

It already supports English and Japanese text, with more languages coming soon – contributions welcome!

Core functionality:

  • langcheck.metrics – metrics to evaluate the quality & structure of LLM-generated text (quick sketch below)
  • langcheck.plot – interactive visualizations of text quality
  • langcheck.augment – text augmentations to perturb prompts, references, etc. (coming soon)
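
To give a quick feel for langcheck.metrics, here's a minimal sketch of a text quality check. It's simplified – the specific metric (fluency) and the assert-on-a-threshold pattern are just one way to use it, so see the Quickstart for the exact API:

    # Minimal sketch of a langcheck.metrics quality check (simplified –
    # see the Quickstart for exact signatures).
    import langcheck

    generated_outputs = [
        "The black cat is sitting.",
        "The big black cat is sitting on the fence.",
        "A cat is sleeping in the sun.",
    ]

    # Score the fluency of each generated output.
    fluency_values = langcheck.metrics.fluency(generated_outputs)
    print(fluency_values)

    # Metric values support comparisons, so they can double as assertions
    # in tests (passes only if every output clears the threshold).
    assert fluency_values > 0.5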

Super open to feedback & curious how other people think about evaluation for LLM apps.

kennysong@alien.top:

If you're open to using an open-source library, you can use LangCheck to monitor and visualize text quality metrics in production.

For example, you can compute & plot the toxicity of user prompts and LLM responses from your logs. (A very simple example here.)
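
Roughly, that flow looks like this (a simplified sketch – it assumes your prompts and responses are already pulled out of your logs as lists of strings; check the docs for the exact plotting call):

    # Sketch of monitoring text quality from production logs (simplified –
    # assumes logs are already loaded as lists of strings).
    import langcheck

    # In practice, these would come from your logging pipeline.
    user_prompts = [
        "How do I reset my password?",
        "This product is garbage!",
    ]
    llm_responses = [
        "You can reset it from the account settings page.",
        "Sorry to hear that – what went wrong?",
    ]

    # Score the toxicity of each prompt and response.
    prompt_toxicity = langcheck.metrics.toxicity(user_prompts)
    response_toxicity = langcheck.metrics.toxicity(llm_responses)

    # Interactive visualization of the metric values
    # (langcheck.plot – exact call may differ, see the docs).
    langcheck.plot.scatter(response_toxicity)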

(Disclaimer: I'm one of the contributors to LangCheck.)