Pilot project for measuring official statistics in AI
🔗 Live demo: https://ai-stats-measurement.lab.sspcloud.fr/
This project explores the reliability of Large Language Models (LLMs) when answering questions based on official statistical data. It compares model responses against trusted sources such as National Statistical Institutes (NSIs).
The goal is to assess whether publicly available data from NSIs is machine-readable and findable, and to identify whether NSIs need to improve the machine-readability and discoverability of their data to better support AI systems.
- Analytics dashboard: Evaluate model performance using metrics such as ARR (Accuracy Rate Ratio) per NSI and per model.
- Response inspection: Load and review all responses for a given prompt.
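The source does not define how ARR is calculated. As a rough sketch only, assuming ARR is the share of responses judged correct within each NSI and model pair, the per-group figure could be computed as below. The `nsi,model,correct` input layout, the `arr_by_group` function name, and the `model-a` label are hypothetical and not taken from the project.

```shell
# Hypothetical input on stdin: one line per evaluated response,
# formatted as "nsi,model,correct" where correct is 1 or 0.
# Assumption: ARR = correct responses / all responses per (nsi, model).
arr_by_group() {
  awk -F, '
    {
      key = $1 "," $2          # group by NSI and model
      total[key]++             # count every response
      correct[key] += $3       # count responses marked correct
    }
    END {
      for (key in total)
        printf "%s,%.2f\n", key, correct[key] / total[key]
    }
  ' | sort                     # awk iteration order is unspecified, so sort
}
```

For example, `printf '%s\n' 'CBS,model-a,1' 'CBS,model-a,0' | arr_by_group` prints one line per group in the form `nsi,model,ratio`.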
Submitting new prompts is currently limited to CBS researchers. Public users can explore results and existing evaluations.
kubectl get pods -n user-diegoespinosa
kubectl get pods -n user-diegoespinosa -w

kubectl get svc -n user-diegoespinosa

kubectl get ingress -n user-diegoespinosa

kubectl logs -f deployment/ai-stats-backend -n user-diegoespinosa
kubectl logs --tail=200 deployment/ai-stats-backend -n user-diegoespinosa

kubectl rollout restart deployment ai-stats-backend -n user-diegoespinosa

helm upgrade --install ai-stats ./ai-stats-measurement-chart -n user-diegoespinosa

helm uninstall ai-stats -n user-diegoespinosa

kubectl delete pod ai-stats-postgres-0 -n user-diegoespinosa

kubectl delete pvc postgres-data-ai-stats-postgres-0 -n user-diegoespinosa

helm uninstall ai-stats -n user-diegoespinosa
kubectl delete pvc postgres-data-ai-stats-postgres-0 -n user-diegoespinosa
helm upgrade --install ai-stats ./ai-stats-measurement-chart -n user-diegoespinosa

kubectl get secrets -n user-diegoespinosa
kubectl describe secret ai-stats-measurement-secrets -n user-diegoespinosa

kubectl rollout restart deployment ai-stats-backend -n user-diegoespinosa
kubectl rollout restart deployment ai-stats-frontend -n user-diegoespinosa

kubectl scale deployment ai-stats-backend --replicas=0 -n user-diegoespinosa
kubectl scale deployment ai-stats-frontend --replicas=0 -n user-diegoespinosa
kubectl scale statefulset ai-stats-postgres --replicas=0 -n user-diegoespinosa

kubectl scale statefulset ai-stats-postgres --replicas=1 -n user-diegoespinosa
kubectl scale deployment ai-stats-backend --replicas=1 -n user-diegoespinosa
kubectl scale deployment ai-stats-frontend --replicas=1 -n user-diegoespinosa
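
The scale-down and scale-up commands above can be wrapped in two small helper functions so the namespace is written once and the order (application first when stopping, database first when starting) stays fixed. This is a minimal sketch, not part of the project: the `NAMESPACE` and `DRY_RUN` variables and the `run`, `stack_stop`, and `stack_start` names are assumptions. By default it only prints the commands.

```shell
# Assumed helper variables; override them in the environment if needed.
NAMESPACE="${NAMESPACE:-user-diegoespinosa}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Scale the application down first, then the database.
stack_stop() {
  run kubectl scale deployment ai-stats-backend --replicas=0 -n "$NAMESPACE"
  run kubectl scale deployment ai-stats-frontend --replicas=0 -n "$NAMESPACE"
  run kubectl scale statefulset ai-stats-postgres --replicas=0 -n "$NAMESPACE"
}

# Bring the database back first, then the application.
stack_start() {
  run kubectl scale statefulset ai-stats-postgres --replicas=1 -n "$NAMESPACE"
  run kubectl scale deployment ai-stats-backend --replicas=1 -n "$NAMESPACE"
  run kubectl scale deployment ai-stats-frontend --replicas=1 -n "$NAMESPACE"
}
```

Calling `stack_stop` lists the three scale-down commands without touching the cluster; setting `DRY_RUN=0` before the call runs them for real.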