Deep Tech Point
first stop in your tech adventure

Scale AI’s Spellbook

January 17, 2024 | AI

Scale AI’s Spellbook is a platform for building, evaluating, and deploying applications powered by large language models (LLMs). It offers a streamlined process that simplifies working with these models, making it easier for developers and organizations to put them to use.

The key features of Spellbook include:

  1. Easy Deployment: Users can quickly deploy their applications with a simple process. It starts with optionally uploading data for evaluation, defining the application prompt and settings, and then simply pressing “Deploy.” This creates a URL to which users can post requests to get started.
  2. Experimentation and Comparison: Spellbook allows users to compare experiments across different models, prompts, and fine-tuning strategies. This comparative analysis is crucial for optimizing the performance of LLM applications.
  3. Production-Ready API Endpoints: One of the standout features of Spellbook is the ability to deploy promising variants to production-ready API endpoints. These endpoints come with built-in monitoring and analytics, so users can track the performance and effectiveness of their applications in real time.
  4. Comprehensive Guides and Documentation: For new users, Spellbook provides extensive documentation, including usage guides and recipes for working through the API. This resource is invaluable for both beginners and experienced users to get the most out of the platform.
  5. Variety of Applications: The platform is versatile, accommodating a range of applications from different domains. Whether it’s for educational purposes, business applications, or research, Spellbook offers the tools and flexibility to tailor LLMs to specific needs.
  6. Community and Support: Scale AI has created a community around Spellbook, offering support and updates. Users can find help for troubleshooting, share experiences, and stay updated with the latest enhancements to the platform.
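To make the deployment step in item 1 a little more concrete, here is a minimal Python sketch of what posting a request to a deployed app's URL could look like. The URL, the `Authorization` scheme, and the `{"input": ...}` payload shape are assumptions for illustration only and are not taken from this article, so use the values your own deployment page shows.

```python
import json
import urllib.request


def build_request(deploy_url: str, api_key: str, variables: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for a deployed app.

    Assumptions: the payload shape and the Authorization scheme are
    placeholders, not a confirmed description of the Spellbook API.
    """
    body = json.dumps({"input": variables}).encode("utf-8")
    return urllib.request.Request(
        deploy_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# Hypothetical URL, key, and variable name, for illustration only.
request = build_request(
    "https://example.com/spellbook/deploy/YOUR_DEPLOYMENT_ID",
    "YOUR_API_KEY",
    {"input": "some text"},
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes it easy to inspect the payload before spending any credits.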

Spellbook represents a significant step forward in the democratization of AI technology, particularly in the field of large language models. It opens up new possibilities for developers and businesses to innovate and experiment with AI, without needing deep expertise in the underlying technology.

From Idea to Deployment: How do I get started?

First, if you haven’t already, sign up at Spellbook for free and you’ll receive $10 in free credits. Next, click on “Create App” and create your first app by adding a super cool app name and an optional description of your app. I wasn’t feeling super inspired, but I wanted to create something simple, an experiment to see how Spellbook works, and that is how I came up with a Tarot card interpreter: it interprets a single Tarot card in the context of predicting the future for a specific demographic group. My prompt was: Write me a Tarot card {{ input }} interpretation for {{ future }} targeting the demographic {{ demographic }}. I used GPT-3.5 Turbo and set the temperature to 0.9 and the maximum tokens to 250, as suggested.

For the experiment, the card was the 8 of swords, the future variable was “what will be my love life”, and the demographic was “single 45-year old woman”.
After “generating output”, this is what came up:
The Eight of Swords represents a stage of feeling trapped or limited in your love life, but it also signifies the potential for breaking free from these confines. For a single 45-year-old woman seeking love, this card suggests that you might be experiencing self-imposed limitations or restrictive beliefs that hinder your romantic prospects.

In your current situation, it is possible that you have built walls around your heart due to past disappointments or failed relationships. These walls may have shielded you from getting hurt but have also prevented new love from entering your life. The Eight of Swords urges you to examine these self-imposed restrictions and question whether they are serving your highest good.

It is important to realize that love is a two-way street, and by hoisting these barriers, you unintentionally restrict the flow of love into your life. You may have developed certain expectations or standards that have become overly rigid, making it challenging for potential partners to meet your criteria. This card encourages you to reevaluate your expectations and consider being more open-minded and flexible.

The Eight of Swords also highlights the significance of communication. It may be time for you to express your desires and intentions more clearly. Speak your truth, express your needs, and allow yourself to be vulnerable. When you open (output cut off here by the 250-token limit).

So, that was fun 🙂 “My app” filled the prompt with my variables, so the model’s answer interpreted the input (8 of swords) for my demographic profile (45-year-old single woman) and her question about the future (love life).

Yes, I know, I was super simplistic, my prompt was a joke, and I didn’t even bother to provide few-shot examples. The aim was to see how the variables work and what kind of output Spellbook provides.
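To show how the `{{ variable }}` placeholders behave, here is a small Python sketch that fills my prompt the way I understand it gets filled. The `render` function is a hypothetical helper written for this post, not Spellbook’s own code, and Spellbook’s real templating may treat edge cases differently.

```python
import re

PROMPT = (
    "Write me a Tarot card {{ input }} interpretation for {{ future }} "
    "targeting the demographic {{ demographic }}."
)


def render(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder with values[name]."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than send a half-filled prompt to the model.
            raise KeyError(f"no value supplied for variable {name!r}")
        return str(values[name])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


filled = render(PROMPT, {
    "input": "8 of swords",
    "future": "what will be my love life",
    "demographic": "single 45-year old woman",
})
print(filled)
```

The printed string is the full prompt the model sees once the three variables are substituted.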

Next, you can play with fine-tuning. To create a new fine-tuned variant, you must specify a few parameters:

  1. Fine-tuned variant name
  2. Fine-tuned model name
  3. Base model
  4. Train dataset

This was a quick experiment and I didn’t have any examples prepared, but according to Scale Spellbook’s Fine Tuning guide, 100 examples are enough for classification tasks, while for tasks such as summarization they recommend at least 500 examples.
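Before uploading a train dataset, it can help to check it against those recommended minimums. Below is a minimal Python sketch of such a check. The `input`/`output` field names and the helper function are assumptions for illustration; only the 100 and 500 figures come from the guide mentioned above.

```python
# Minimum example counts quoted above from the fine-tuning guide.
RECOMMENDED_MINIMUM = {"classification": 100, "summarization": 500}


def examples_missing(task: str, examples: list) -> int:
    """Return how many more examples are needed to reach the recommended minimum."""
    return max(0, RECOMMENDED_MINIMUM[task] - len(examples))


# Hypothetical training rows; the field names are assumptions.
train = [{"input": f"text {i}", "output": "label"} for i in range(80)]

print(examples_missing("classification", train))  # 20
print(examples_missing("summarization", train))   # 420
```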

Next, as you fine-tune, you can save variants of your app and, when ready, deploy your app, aka use your LLM in production!

If you create several app variants, you can compare them, or you can proceed with evaluation, a process in Scale’s Spellbook that is designed to assess the performance of large language models. You upload your data, and the system evaluates the model’s responses against it. This helps you understand how well the model performs on specific tasks or with particular datasets. The evaluation process is crucial for identifying where the model excels or needs improvement, and this is the point where you can fine-tune further or adjust the app to improve its performance.
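To give a feel for what comparing variants against uploaded data means, here is a minimal Python sketch that scores two variants with exact-match accuracy. This is a stand-in metric chosen for simplicity, since I don’t know which metrics Spellbook computes internally, and the labels and variant outputs below are made up for illustration.

```python
def exact_match_accuracy(predictions: list, expected: list) -> float:
    """Share of predictions equal to the expected answer (case-insensitive)."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        return 0.0
    matches = sum(
        p.strip().lower() == e.strip().lower()
        for p, e in zip(predictions, expected)
    )
    return matches / len(expected)


# Made-up evaluation data and variant outputs.
expected = ["positive", "negative", "positive", "neutral"]
variant_outputs = {
    "variant_a": ["positive", "negative", "negative", "neutral"],
    "variant_b": ["positive", "positive", "negative", "neutral"],
}

scores = {
    name: exact_match_accuracy(outputs, expected)
    for name, outputs in variant_outputs.items()
}
best = max(scores, key=scores.get)
print(scores)  # {'variant_a': 0.75, 'variant_b': 0.5}
print(best)    # variant_a
```

Exact match only suits tasks with short, fixed answers such as classification. Free-form output like my Tarot readings would need a different kind of scoring.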