Pass the Databricks Generative AI Engineer Databricks-Generative-AI-Engineer-Associate Questions and answers with ValidTests

Exam Databricks-Generative-AI-Engineer-Associate All Questions

Exam Databricks-Generative-AI-Engineer-Associate Premium Access

View all detail and faqs for the Databricks-Generative-AI-Engineer-Associate exam

Go to Exam

Viewing page 2 out of 3 pages

Viewing questions 11-20 out of questions

Questions # 11:

A Generative AI Engineer developed an LLM application using the provisioned throughput Foundation Model API. Now that the application is ready to be deployed, they realize their volume of requests are not sufficiently high enough to create their own provisioned throughput endpoint. They want to choose a strategy that ensures the best cost-effectiveness for their application.

What strategy should the Generative AI Engineer use?

Options:

Switch to using External Models instead

Deploy the model using pay-per-token throughput as it comes with cost guarantees

Change to a model with a fewer number of parameters in order to reduce hardware constraint issues

Throttle the incoming batch of requests manually to avoid rate limiting issues

Expert Solution

Questions # 12:

Generative AI Engineer at an electronics company just deployed a RAG application for customers to ask questions about products that the company carries. However, they received feedback that the RAG response often returns information about an irrelevant product.

What can the engineer do to improve the relevance of the RAG’s response?

Options:

Assess the quality of the retrieved context

Implement caching for frequently asked questions

Use a different LLM to improve the generated response

Use a different semantic similarity search algorithm

Expert Solution

Answer

Questions # 13:

A Generative Al Engineer is building an LLM-based application that has an

important transcription (speech-to-text) task. Speed is essential for the success of the application

Which open Generative Al models should be used?

Options:

L!ama-2-70b-chat-hf

MPT-30B-lnstruct

DBRX

whisper-large-v3 (1.6B)

Expert Solution

Answer

Explanation

The task requires an open generative AI model for a transcription (speech-to-text) task where speed is essential. Let’s assess the options based on their suitability for transcription and performance characteristics, referencing Databricks’ approach to model selection.

Option A: Llama-2-70b-chat-hf

Llama-2 is a text-based LLM optimized for chat and text generation, not speech-to-text. It lacks transcription capabilities.

Databricks Reference:"Llama models are designed for natural language generation, not audio processing"("Databricks Model Catalog").

Option B: MPT-30B-Instruct

MPT-30B is another text-based LLM focused on instruction-following and text generation, not transcription. It’s irrelevant for speech-to-text tasks.

Databricks Reference: No specific mention, but MPT is categorized under text LLMs in Databricks’ ecosystem, not audio models.

Option C: DBRX

DBRX, developed by Databricks, is a powerful text-based LLM for general-purpose generation. It doesn’t natively support speech-to-text and isn’t optimized for transcription.

Databricks Reference:"DBRX excels at text generation and reasoning tasks"("Introducing DBRX," 2023)—no mention of audio capabilities.

Option D: whisper-large-v3 (1.6B)

Whisper, developed by OpenAI, is an open-source model specifically designed for speech-to-text transcription. The “large-v3” variant (1.6 billion parameters) balances accuracy and efficiency, with optimizations for speed via quantization or deployment on GPUs—key for the application’s requirements.

Databricks Reference:"For audio transcription, models like Whisper are recommended for their speed and accuracy"("Generative AI Cookbook," 2023). Databricks supports Whisper integration in its MLflow or Lakehouse workflows.

Conclusion: OnlyD. whisper-large-v3is a speech-to-text model, making it the sole suitable choice. Its design prioritizes transcription, and its efficiency (e.g., via optimized inference) meets the speed requirement, aligning with Databricks’ model deployment best practices.

Questions # 14:

A Generative AI Engineer has been asked to design an LLM-based application that accomplishes the following business objective: answer employee HR questions using HR PDF documentation.

Which set of high level tasks should the Generative AI Engineer's system perform?

Options:

Calculate averaged embeddings for each HR document, compare embeddings to user query to find the best document. Pass the best document with the user query into an LLM with a large context window to generate a response to the employee.

Use an LLM to summarize HR documentation. Provide summaries of documentation and user query into an LLM with a large context window to generate a response to the user.

Create an interaction matrix of historical employee questions and HR documentation. Use ALS to factorize the matrix and create embeddings. Calculate the embeddings of new queries and use them to find the best HR documentation. Use an LLM to generate a response to the employee question based upon the documentation retrieved.

Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved.

Expert Solution

Answer

Questions # 15:

A team wants to serve a code generation model as an assistant for their software developers. It should support multiple programming languages. Quality is the primary objective.

Which of the Databricks Foundation Model APIs, or models available in the Marketplace, would be the best fit?

Options:

Llama2-70b

BGE-large

MPT-7b

CodeLlama-34B

Expert Solution

Questions # 16:

A Generative Al Engineer has built an LLM-based system that will automatically translate user text between two languages. They now want to benchmark multiple LLM's on this task and pick the best one. They have an evaluation set with known high quality translation examples. They want to evaluate each LLM using the evaluation set with a performant metric.

Which metric should they choose for this evaluation?

Options:

ROUGE metric

BLEU metric

NDCG metric

RECALL metric

Expert Solution

Answer

Questions # 17:

A Generative Al Engineer is creating an LLM system that will retrieve news articles from the year 1918 and related to a user's query and summarize them. The engineer has noticed that the summaries are generated well but often also include an explanation of how the summary was generated, which is undesirable.

Which change could the Generative Al Engineer perform to mitigate this issue?

Options:

Split the LLM output by newline characters to truncate away the summarization explanation.

Tune the chunk size of news articles or experiment with different embedding models.

Revisit their document ingestion logic, ensuring that the news articles are being ingested properly.

Provide few shot examples of desired output format to the system and/or user prompt.

Expert Solution

Questions # 18:

Which TWO chain components are required for building a basic LLM-enabled chat application that includes conversational capabilities, knowledge retrieval, and contextual memory?

Options:

(Q)

Vector Stores

Conversation Buffer Memory

External tools

Chat loaders

React Components

Expert Solution

Answer

B, C

Explanation

Building a basic LLM-enabled chat application with conversational capabilities, knowledge retrieval, and contextual memory requires specific components that work together to process queries, maintain context, and retrieve relevant information. Databricks’ Generative AI Engineer documentation outlines key components for such systems, particularly in the context of frameworks like LangChain or Databricks’ MosaicML integrations. Let’s evaluate the required components:

Understanding the Requirements:

Conversational capabilities: The app must generate natural, coherent responses.

Knowledge retrieval: It must access external or domain-specific knowledge.

Contextual memory: It must remember prior interactions in the conversation.

Databricks Reference:"A typical LLM chat application includes a memory component to track conversation history and a retrieval mechanism to incorporate external knowledge"("Databricks Generative AI Cookbook," 2023).

Evaluating the Options:

A. (Q): This appears incomplete or unclear (possibly a typo). Without further context, it’s not a valid component.

B. Vector Stores: These store embeddings of documents or knowledge bases, enabling semantic search and retrieval of relevant information for the LLM. This is critical for knowledge retrieval in a chat application.

Databricks Reference:"Vector stores, such as those integrated with Databricks’ Lakehouse, enable efficient retrieval of contextual data for LLMs"("Building LLM Applications with Databricks").

C. Conversation Buffer Memory: This component stores the conversation history, allowing the LLM to maintain context across multiple turns. It’s essential for contextual memory.

Databricks Reference:"Conversation Buffer Memory tracks prior user inputs and LLM outputs, ensuring context-aware responses"("Generative AI Engineer Guide").

D. External tools: These (e.g., APIs or calculators) enhance functionality but aren’t required for abasicchat app with the specified capabilities.

E. Chat loaders: These might refer to data loaders for chat logs, but they’re not a core chain component for conversational functionality or memory.

F. React Components: These relate to front-end UI development, not the LLM chain’s backend functionality.

Selecting the Two Required Components:

Forknowledge retrieval, Vector Stores (B) are necessary to fetch relevant external data, a cornerstone of Databricks’ RAG-based chat systems.

Forcontextual memory, Conversation Buffer Memory (C) is required to maintain conversation history, ensuring coherent and context-aware responses.

While an LLM itself is implied as the core generator, the question asks for chain components beyond the model, making B and C the minimal yet sufficient pair for a basic application.

Conclusion: The two required chain components areB. Vector StoresandC. Conversation Buffer Memory, as they directly address knowledge retrieval and contextual memory, respectively, aligning with Databricks’ documented best practices for LLM-enabled chat applications.

Questions # 19:

All of the following are Python APIs used to query Databricks foundation models. When running in an interactive notebook, which of the following libraries does not automatically use the current session credentials?

Options:

OpenAI client

REST API via requests library

MLflow Deployments SDK

Databricks Python SDK

Questions # 20:

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.

Which metric should they monitor for their customer service LLM application in production?

Options:

Number of customer inquiries processed per unit of time

Energy usage per query

Final perplexity scores for the training of the model

HuggingFace Leaderboard values for the base LLM

Answer

Viewing page 2 out of 3 pages

Viewing questions 11-20 out of questions

Pre-Summer Special Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: validbest

Pass the Databricks Generative AI Engineer Databricks-Generative-AI-Engineer-Associate Questions and answers with ValidTests

Exam Databricks-Generative-AI-Engineer-Associate Premium Access

Options:

Options:

Options:

Options:

Options:

Options:

Options:

Options:

Options:

Options: