This project is an example of how to use Niantic Studio and Gemini to capture images from the user's screen and get textual feedback from Gemini.
Studio: Gemini Image Sample Generation
Overview
This project showcases how to use Niantic Studio's AR capabilities to capture a screenshot of the user’s current view, send it to Gemini (an LLM), and generate a formatted response. The response is segmented into a quiz question and four multiple-choice answers, which are then displayed in the AR environment.
You must add your own Gemini API key in random-quote-api.ts for the project to work. Replace the ADD_KEY_HERE placeholder with your key.
Controls
- Screenshot Button:
Clicking the button captures the current AR view, sends the screenshot to Gemini, and displays the generated question and answers.
Components
Random Quote API
Description:
This component handles the screenshot capture, interaction with Gemini's API, and updating the UI with the generated quiz content.
Schema:
- Question: Entity ID for the quiz question UI element.
- Ans1: Entity ID for the first answer UI element.
- Ans2: Entity ID for the second answer UI element.
- Ans3: Entity ID for the third answer UI element.
- Ans4: Entity ID for the fourth answer UI element.
Data:
- content: A string that stores data sent to and received from the Gemini API.
Functionality:
- Screenshot Capture: Uses the XR8 module to capture the current AR view.
- API Request: Sends the screenshot along with a formatted prompt to the Gemini API.
- Response Handling: Parses the API response, extracting the question and answers.
- UI Update: Updates the AR interface to display the generated quiz question and answers.
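The API Request and Response Handling steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names (geminiUrl, buildGeminiRequest, extractText) are hypothetical, the model name is chosen by the caller, and the request and response shapes assume the generateContent REST endpoint of Google's Generative Language API.

```typescript
// Assumed shapes for the Gemini generateContent REST endpoint.
interface GeminiPart {
  text?: string
  inline_data?: {mime_type: string; data: string}
}

interface GeminiRequest {
  contents: {parts: GeminiPart[]}[]
}

interface GeminiResponse {
  candidates?: {content?: {parts?: {text?: string}[]}}[]
}

// Hypothetical helper: builds the endpoint URL for a given model and API key.
function geminiUrl(model: string, apiKey: string): string {
  const base = 'https://generativelanguage.googleapis.com/v1beta/models'
  return `${base}/${model}:generateContent?key=${encodeURIComponent(apiKey)}`
}

// Hypothetical helper: pairs the text prompt with a base64-encoded JPEG screenshot.
function buildGeminiRequest(prompt: string, jpegBase64: string): GeminiRequest {
  return {
    contents: [{
      parts: [
        {text: prompt},
        {inline_data: {mime_type: 'image/jpeg', data: jpegBase64}},
      ],
    }],
  }
}

// Hypothetical helper: returns the first candidate's text, or '' when it is missing.
function extractText(response: GeminiResponse): string {
  return response.candidates?.[0]?.content?.parts?.[0]?.text ?? ''
}
```

The request body would be serialized with JSON.stringify and sent as a POST with a Content-Type of application/json. Returning an empty string for a missing candidate lets the caller treat an empty or blocked response the same way as a parse failure.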
Project Setup
Main Entities:
- UI Elements: Question, Ans1, Ans2, Ans3, and Ans4 are linked to AR UI elements for displaying the quiz.
- Button: Triggers the screenshot and API request.
Organization:
- Components are registered using Niantic Studio's ECS framework.
- Events are bound to UI elements for interaction.
Customization
- Modifying the Prompt:
Update the prompt variable in the code to customize the type of quiz or content generated by Gemini.
- Adjusting UI Elements:
Change the properties of the Question, Ans1, Ans2, Ans3, and Ans4 entities in the schema to match your AR design.
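As an illustration of customizing the prompt, the sketch below builds a prompt that asks Gemini for one question and four answers on a single line. The function name buildQuizPrompt, the wording of the prompt, and the '|' delimiter are all assumptions made for this example; the project's own prompt and separator may differ.

```typescript
// Hypothetical delimiter; the project's actual separator may differ.
const DELIMITER = '|'

// Hypothetical helper: builds a prompt asking for a question and four answers
// in a fixed, delimiter-separated format so the reply is easy to split.
function buildQuizPrompt(topic: string, delimiter: string = DELIMITER): string {
  const format = ['question', 'answer1', 'answer2', 'answer3', 'answer4'].join(delimiter)
  return [
    `Look at the attached image and write one ${topic} quiz question about it.`,
    'Provide exactly four answer options.',
    `Reply on a single line in the form: ${format}`,
    'Do not add any other text.',
  ].join(' ')
}
```

Changing the topic argument (for example 'geography' or 'science') changes the kind of quiz requested while keeping the reply format stable, so the parsing code does not need to change.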
Technical Details
- Screenshot Configuration:
The screenshot resolution is limited to 640px with 50% JPEG compression for optimal performance.
- API Key:
Replace the placeholder API key in the fetch request with your Gemini API key.
- Error Handling:
Includes error handling for both screenshot capture and API requests to ensure robustness.
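The size limit and error handling described above could look roughly like the sketch below. Both helpers (fitWithin and attempt) are hypothetical and are not taken from the project. The sketch also assumes that the 640px limit applies to the longer side of the image, which the description does not state.

```typescript
// Hypothetical helper: scales dimensions down so the longer side is at most
// maxDimension, preserving aspect ratio. Smaller images are left unchanged.
function fitWithin(
  width: number,
  height: number,
  maxDimension: number = 640
): {width: number; height: number} {
  const longest = Math.max(width, height)
  if (longest <= maxDimension) {
    return {width, height}
  }
  const scale = maxDimension / longest
  return {width: Math.round(width * scale), height: Math.round(height * scale)}
}

// Result type so callers can branch on success or failure without try/catch.
type Result<T> = {ok: true; value: T} | {ok: false; error: string}

// Hypothetical helper: runs an async step (screenshot capture or API request)
// and converts a thrown error or rejected promise into a labeled failure.
async function attempt<T>(label: string, task: () => Promise<T>): Promise<Result<T>> {
  try {
    return {ok: true, value: await task()}
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    return {ok: false, error: `${label} failed: ${message}`}
  }
}
```

Wrapping each step separately, for instance attempt('screenshot', ...) followed by attempt('gemini request', ...), makes it clear in the UI or console which stage failed.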
Input Manager
Mapped Actions:
- Click: Captures the screenshot and sends it to Gemini.
Asset Attribution
- Gemini API:
Provided by Google Generative Language APIs.
Additional Notes
- Ensure that the XR8 module is properly initialized before attempting to capture screenshots.
- The response text is split into an array that contains the question and all four answers, separated by a delimiter.
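A minimal sketch of splitting the response into the question and four answers is shown below. The parseQuiz function, the Quiz type, and the '|' delimiter are assumptions for illustration only; the separator used by the project is not specified here.

```typescript
interface Quiz {
  question: string
  answers: string[]
}

// Hypothetical helper: splits Gemini's reply into one question and four answers.
// Returns null when the reply does not contain exactly five non-empty segments.
function parseQuiz(raw: string, delimiter: string = '|'): Quiz | null {
  const parts = raw
    .split(delimiter)
    .map(part => part.trim())
    .filter(part => part.length > 0)
  if (parts.length !== 5) {
    return null
  }
  return {question: parts[0], answers: parts.slice(1)}
}
```

Returning null for a malformed reply gives the UI update step a simple check before it writes text into the Question and Ans1 through Ans4 elements.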
Other Work by GTXR Club