Half the Cost!? Gemini Batch Prediction Is Now GA, So I Tried It Out

This is Nishida from the DX Development Division.

Batch prediction | Generative AI on Vertex AI | Google Cloud

On November 8, 2024, batch prediction for Gemini became generally available (GA).
It can be used with Gemini 1.0 Pro, Gemini 1.5 Pro, and Gemini 1.5 Flash.

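Batch prediction processes a large number of multimodal prompts in a single job.
Input and output can use either Cloud Storage or BigQuery.
Results are returned asynchronously, but in exchange requests are billed at a 50% discount compared with standard requests.
One caveat: because batch jobs run on off-peak capacity, it is hard to predict how long processing will take.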
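To make the 50% discount concrete, here is a small arithmetic sketch. The per-token prices are hypothetical placeholders, not real Gemini pricing, and the function name `batch_cost` is invented for illustration. The token counts are the ones reported for the pixel8 request in the Cloud Storage results further down.

```python
def batch_cost(prompt_tokens, output_tokens,
               input_price_per_1m, output_price_per_1m, discount=0.5):
    """Return (standard_cost, batch_cost) in the same currency as the prices."""
    standard = (prompt_tokens * input_price_per_1m
                + output_tokens * output_price_per_1m) / 1_000_000
    return standard, standard * (1 - discount)


# Hypothetical prices per 1M tokens (assumption, NOT real pricing).
standard, batch = batch_cost(
    prompt_tokens=16_830,
    output_tokens=151,
    input_price_per_1m=1.0,
    output_price_per_1m=4.0,
)
print(f"standard={standard:.6f} batch={batch:.6f}")
```

Substitute the current prices for your model and region; the batch figure is simply half of the standard one.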

I tried the official sample code in Google Colab.

Setup, library installation, and authentication

#@title Settings
#@markdown ## https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini
#@markdown ## Change these to match your environment
PROJECT_ID = 'my-project' # @param {type:'string'}
LOCATION = 'asia-northeast1'  # @param {type:'string'}
CLOUD_STORAGE_BUCKET = 'my-bucket' # @param {type:'string'}
# BigQuery dataset IDs may contain only letters, numbers, and underscores (no hyphens)
DATASET_ID = 'my_dataset' # @param {type:'string'}

#@title Install libraries
!pip install google-cloud-aiplatform==1.71.1 google-cloud-storage==2.18.2 google-cloud-bigquery==3.26.0

#@title Authenticate so the APIs can be used from Colab
from google.colab import auth
auth.authenticate_user()

Writing results to Cloud Storage

#@title Create a JSONL file and upload it to Cloud Storage
import json
from google.cloud import storage


data = [
    {
        "request": {
            "contents": [
                {
                    "role": "user",
                    "parts": [
                        {
                            # Japanese prompt: "What is the relationship between the following video and image samples?"
                            "text": "次のビデオと画像サンプルの関係は何ですか？"
                        },
                        {
                            "fileData": {
                                "fileUri": "gs://cloud-samples-data/generative-ai/video/animals.mp4", "mimeType": "video/mp4"
                            }
                        },
                        {
                            "fileData": {
                                "fileUri": "gs://cloud-samples-data/generative-ai/image/cricket.jpeg", "mimeType": "image/jpeg"
                            }
                        }
                    ]
                }
            ]
        }
    },
    {
        "request": {
            "contents": [
                {
                    "role": "user",
                    "parts": [
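                        {
                            # Japanese prompt: "Describe the content."
                            "text": "内容を説明して。"
                        },
                        {
                            "fileData": {
                                # "video/mov" is what this run submitted; "video/mp4" would match the .mp4 file
                                "fileUri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4", "mimeType": "video/mov"
                            }
                        }
                    ]
                }
            ],
            "system_instruction": {
                "parts": [
                    {
                        # Japanese system instruction: "You are an AI assistant that analyzes videos."
                        "text": "あなたは動画解析を行うAIアシスタントです。"
                    }
                ]
            }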
        }
    }
]

source_file_name = "data.jsonl"
with open(source_file_name, "w") as f:
    for entry in data:
        f.write(json.dumps(entry) + "\n")

storage_client = storage.Client()

destination_blob_name = "input/data.jsonl"
bucket = storage_client.bucket(CLOUD_STORAGE_BUCKET)
blob = bucket.blob(destination_blob_name)

blob.upload_from_filename(source_file_name)

print(f"File {source_file_name} uploaded to {destination_blob_name}.")
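Because a batch job can take a while to report problems, it may be worth checking the JSONL locally before uploading. The following is a minimal sketch: the function name `validate_jsonl_lines` is hypothetical, and the checks (one JSON object per line, a `request` object with a non-empty `contents` list) are assumptions based on the records used in this article, not the service's full validation rules.

```python
import json


def validate_jsonl_lines(lines):
    """Return a list of (line_number, problem) tuples; empty means no problems found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        # Each record is expected to wrap the prompt in a "request" object.
        request = entry.get("request") if isinstance(entry, dict) else None
        if not isinstance(request, dict):
            problems.append((number, 'missing "request" object'))
            continue
        contents = request.get("contents")
        if not isinstance(contents, list) or not contents:
            problems.append((number, '"request.contents" must be a non-empty list'))
    return problems


sample = [
    json.dumps({"request": {"contents": [{"role": "user", "parts": [{"text": "hello"}]}]}}),
    '{"request": {}}',
    "not json",
]
for number, problem in validate_jsonl_lines(sample):
    print(f"line {number}: {problem}")
```

In the notebook, this could be run as `validate_jsonl_lines(open(source_file_name))` before the upload step.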

#@title Submit the batch prediction job
import time
import vertexai

from vertexai.batch_prediction import BatchPredictionJob


vertexai.init(project=PROJECT_ID, location=LOCATION)

input_uri = f"gs://{CLOUD_STORAGE_BUCKET}/{destination_blob_name}"
output_uri = f"gs://{CLOUD_STORAGE_BUCKET}/output/"

batch_prediction_job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-002",
    input_dataset=input_uri,
    output_uri_prefix=output_uri,
)

# Check the job status
print(f"Job resource name: {batch_prediction_job.resource_name}")
print(f"Model resource name with the job: {batch_prediction_job.model_name}")
print(f"Job state: {batch_prediction_job.state.name}")

# Refresh every 5 seconds until the job has ended
while not batch_prediction_job.has_ended:
    time.sleep(5)
    batch_prediction_job.refresh()

# Check the final status
if batch_prediction_job.has_succeeded:
    print("Job succeeded!")
else:
    print(f"Job failed: {batch_prediction_job.error}")

# Check the output location
print(f"Job output location: {batch_prediction_job.output_location}")
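Since processing time is hard to predict, a fixed 5-second loop can end up polling for a long time. One alternative is exponential backoff with an overall timeout. The sketch below is an assumption-laden illustration: `wait_for_job` and `FakeJob` are hypothetical names, and the helper relies only on the `has_ended` attribute and `refresh()` method used in the loop above.

```python
import time


def wait_for_job(job, initial_delay=5.0, max_delay=60.0, timeout=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll job.refresh() with exponential backoff.

    Returns True if the job ended, False if the timeout was reached first.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while not job.has_ended:
        if clock() >= deadline:
            return False
        sleep(delay)
        job.refresh()
        # Double the wait each round, capped at max_delay.
        delay = min(delay * 2, max_delay)
    return True


class FakeJob:
    """Stand-in for a batch job that ends after a fixed number of refreshes."""

    def __init__(self, refreshes_needed):
        self.refreshes_needed = refreshes_needed
        self.has_ended = False

    def refresh(self):
        self.refreshes_needed -= 1
        if self.refreshes_needed <= 0:
            self.has_ended = True


# Record the requested delays instead of actually sleeping.
delays = []
finished = wait_for_job(FakeJob(4), sleep=delays.append)
print(finished, delays)
```

With the real job object the call would be `wait_for_job(batch_prediction_job)`, followed by the same `has_succeeded` check as above.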

#@title Check the output
gs_url = batch_prediction_job.output_location

# Drop the "gs://" prefix, then split into bucket name and object path
path_parts = gs_url[5:].split("/", 1)
bucket_name = path_parts[0]
blob_path = path_parts[1] if len(path_parts) > 1 else ""

storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(blob_path + "/predictions.jsonl")

# Each line of predictions.jsonl is one JSON record; pretty-print them
file_content = blob.download_as_text()
for line in file_content.strip().splitlines():
    json_object = json.loads(line)
    print(json.dumps(json_object, indent=4, ensure_ascii=False))
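The string slicing above works for the output location in this run, but it assumes the `gs://` prefix and no trailing slash. A slightly more defensive sketch using the standard library's `urllib.parse` is shown below. The helper names and the example URI are hypothetical; the `predictions.jsonl` file name is the one used in the cell above.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split a gs:// URI into (bucket_name, object_path_without_outer_slashes)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.strip("/")


def predictions_blob_path(output_location, file_name="predictions.jsonl"):
    """Return (bucket_name, blob_path) for the predictions file under output_location."""
    bucket_name, prefix = split_gcs_uri(output_location)
    blob_path = f"{prefix}/{file_name}" if prefix else file_name
    return bucket_name, blob_path


# Hypothetical output location; a trailing slash does not produce "//".
print(predictions_blob_path("gs://my-bucket/output/job-123/"))
```

The returned pair can be passed to `storage_client.bucket(...)` and `bucket.blob(...)` exactly as in the cell above.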

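Output

The responses are in Japanese because the prompts were. The first record describes the clip as a Google Pixel 8 promotional video introducing Video Boost; the second describes a Google Photos ad in which animals at the Los Angeles Zoo take selfies. Note that the records are not in the same order as the input.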

{
    "status": "",
    "processed_time": "2024-11-11T13:34:57.875+00:00",
    "request": {
        "contents": [
            {
                "parts": [
                    {
                        "fileData": null,
                        "text": "内容を説明して。"
                    },
                    {
                        "fileData": {
                            "fileUri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
                            "mimeType": "video/mov"
                        },
                        "text": null
                    }
                ],
                "role": "user"
            }
        ],
        "system_instruction": {
            "parts": [
                {
                    "text": "あなたは動画解析を行うAIアシスタントです。"
                }
            ]
        }
    },
    "response": {
        "candidates": [
            {
                "avgLogprobs": -0.19646367963576158,
                "content": {
                    "parts": [
                        {
                            "text": "この動画は、Google Pixel 8のプロモーションビデオです。\n\n東京のフォトグラファーである島田さえかさんが、Pixel 8を使って夜の東京の街を撮影する様子が映し出されています。\n\n動画では、Pixel 8の新しい機能である「Video Boost」が紹介されています。この機能は、暗い場所でも高画質の動画撮影を可能にするナイトサイト機能を自動で起動するものです。\n\n島田さんは、最初に三茶と呼ばれる地域を訪れ、思い出の場所をPixel 8で撮影します。その後、渋谷に移動し、賑やかな街の雰囲気を捉えます。\n\n動画全体を通して、夜の東京の美しい風景と、Pixel 8の高性能カメラが強調されています。"
                        }
                    ],
                    "role": "model"
                },
                "finishReason": "STOP"
            }
        ],
        "modelVersion": "gemini-1.5-flash-002@default",
        "usageMetadata": {
            "candidatesTokenCount": 151,
            "promptTokenCount": 16830,
            "totalTokenCount": 16981
        }
    }
}
{
    "status": "",
    "processed_time": "2024-11-11T13:34:57.869+00:00",
    "request": {
        "contents": [
            {
                "parts": [
                    {
                        "fileData": null,
                        "text": "次のビデオと画像サンプルの関係は何ですか？"
                    },
                    {
                        "fileData": {
                            "fileUri": "gs://cloud-samples-data/generative-ai/video/animals.mp4",
                            "mimeType": "video/mp4"
                        },
                        "text": null
                    },
                    {
                        "fileData": {
                            "fileUri": "gs://cloud-samples-data/generative-ai/image/cricket.jpeg",
                            "mimeType": "image/jpeg"
                        },
                        "text": null
                    }
                ],
                "role": "user"
            }
        ],
        "system_instruction": null
    },
    "response": {
        "candidates": [
            {
                "avgLogprobs": -0.8414706354555876,
                "content": {
                    "parts": [
                        {
                            "text": "これは、Google フォト広告のビデオです。このビデオは、ロサンゼルス動物園の動物が、Google フォトを使用して自分の自撮り写真を撮っている様子を描写しています。ビデオでは、さまざまな動物が自撮りをする様子と、Google フォトのアプリを使用して写真がバックアップされ、ソーシャルメディアで共有されている様子が紹介されています。"
                        }
                    ],
                    "role": "model"
                },
                "finishReason": "STOP"
            }
        ],
        "modelVersion": "gemini-1.5-flash-002@default",
        "usageMetadata": {
            "candidatesTokenCount": 69,
            "promptTokenCount": 29177,
            "totalTokenCount": 29246
        }
    }
}
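In the output above, the pixel8 record comes first even though it was the second line of the input, so the results should not be matched to the inputs by position. Each record echoes its `request`, which can serve as a join key. The following is a minimal sketch with hypothetical helper names and made-up sample records; it assumes no two requests have identical text and file URIs.

```python
def request_key(request):
    """Build a hashable key from the text and file URIs in a request's parts."""
    key = []
    for content in request.get("contents", []):
        for part in content.get("parts", []):
            # Output records carry explicit nulls, e.g. "fileData": null.
            file_data = part.get("fileData") or {}
            key.append(part.get("text") or file_data.get("fileUri"))
    return tuple(key)


def match_outputs(inputs, outputs):
    """Return the response text for each input, in input order (None if unmatched)."""
    by_key = {request_key(record["request"]): record for record in outputs}
    texts = []
    for entry in inputs:
        record = by_key.get(request_key(entry["request"]))
        if record is None:
            texts.append(None)
            continue
        candidates = (record.get("response") or {}).get("candidates") or []
        parts = (candidates[0].get("content") or {}).get("parts") or [] if candidates else []
        texts.append("".join(part.get("text") or "" for part in parts) or None)
    return texts


# Made-up records in the same shape as the article's input and output.
inputs = [
    {"request": {"contents": [{"role": "user", "parts": [{"text": "prompt A"}]}]}},
    {"request": {"contents": [{"role": "user", "parts": [
        {"text": "prompt B"},
        {"fileData": {"fileUri": "gs://example-bucket/b.mp4", "mimeType": "video/mp4"}},
    ]}]}},
]
outputs = [
    {"request": {"contents": [{"role": "user", "parts": [
        {"fileData": None, "text": "prompt B"},
        {"fileData": {"fileUri": "gs://example-bucket/b.mp4", "mimeType": "video/mp4"}, "text": None},
    ]}]},
     "response": {"candidates": [{"content": {"parts": [{"text": "answer B"}], "role": "model"}}]}},
    {"request": {"contents": [{"role": "user", "parts": [{"fileData": None, "text": "prompt A"}]}]},
     "response": {"candidates": [{"content": {"parts": [{"text": "answer A"}], "role": "model"}}]}},
]
print(match_outputs(inputs, outputs))
```

In the notebook, `inputs` would be the `data` list and `outputs` the parsed lines of `predictions.jsonl`.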

Writing results to BigQuery

#@title Create the BigQuery table
from google.cloud import bigquery
from google.api_core.exceptions import Conflict


client = bigquery.Client()

table_id = "predictions"

table_ref = f"{PROJECT_ID}.{DATASET_ID}.{table_id}"

schema = [
    bigquery.SchemaField("status", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("processed_time", "TIMESTAMP", mode="NULLABLE"),
    bigquery.SchemaField("request", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("response", "STRING", mode="NULLABLE"),
]

table = bigquery.Table(table_ref, schema=schema)

try:
    table = client.create_table(table)
    print(f"Table {table.table_id} created successfully.")
except Conflict:
    print(f"Table {table.table_id} already exists.")

#@title Store the prediction results in the BigQuery table

output_uri = f"bq://{PROJECT_ID}.{DATASET_ID}.{table_id}"

# Submit a batch prediction job with Gemini model
batch_prediction_job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-002",
    input_dataset=input_uri,
    output_uri_prefix=output_uri,
)

# Check the job status
print(f"Job resource name: {batch_prediction_job.resource_name}")
print(f"Model resource name with the job: {batch_prediction_job.model_name}")
print(f"Job state: {batch_prediction_job.state.name}")

# Refresh every 5 seconds until the job has ended
while not batch_prediction_job.has_ended:
    time.sleep(5)
    batch_prediction_job.refresh()

# Check the final status
if batch_prediction_job.has_succeeded:
    print("Job succeeded!")
else:
    print(f"Job failed: {batch_prediction_job.error}")

# Check the output location
print(f"Job output location: {batch_prediction_job.output_location}")

#@title Check the table
sql = f"""
    SELECT *
    FROM `{table_ref}`
"""

client = bigquery.Client(project=PROJECT_ID)
query_result = client.query(sql)

# Wait for the query to finish and load the rows into a DataFrame
df = query_result.result().to_dataframe()
df.head()

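Output

In this run the response to the video-and-image prompt reads, in English, "That's a great question! The video and image are not related at all.", which differs from the response in the Cloud Storage run.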

status	processed_time	request	response
	2024-11-11 14:01:41.491000+00:00	{"contents":[{"parts":[{"fileData":null,"text":"内容を説明して。"},{"fileData":{"fileUri":"gs://cloud-samples-data/generative-ai/video/pixel8.mp4","mimeType":"video/mov"},"text":null}],"role":"user"}],"system_instruction":{"parts":[{"text":"あなたは動画解析を行うAIアシスタントです。"}]}}	{"candidates":[{"avgLogprobs":-0.3572149152879591,"content":{"parts":[{"text":"この動画は、Google Pixel の新しい機能「ビデオブースト」を紹介する広告です。\n\n東京のフォトグラファーである島田さえかさんが、夜の東京の街をPixelで撮影しながら、その魅力を語っています。\n\nビデオブーストは、暗い場所でも「ナイトサイト」機能が自動的に起動し、動画の画質を向上させる機能です。\n\n動画では、三茶と渋谷の街並みが、Pixel 8 Proによって美しく撮影されています。特に、ナイトサイトが有効になっている様子が印象的で、暗い場所でも鮮明で美しい映像を捉えていることがわかります。\n\n全体を通して、夜の東京の多様な魅力とPixel 8 Proの高性能カメラが融合した、洗練された広告となっています。"}],"role":"model"},"finishReason":"STOP"}],"modelVersion":"gemini-1.5-flash-002@default","usageMetadata":{"candidatesTokenCount":154,"promptTokenCount":16830,"totalTokenCount":16984}}
	2024-11-11 14:01:41.486000+00:00	{"contents":[{"parts":[{"fileData":null,"text":"次のビデオと画像サンプルの関係は何ですか？"},{"fileData":{"fileUri":"gs://cloud-samples-data/generative-ai/video/animals.mp4","mimeType":"video/mp4"},"text":null},{"fileData":{"fileUri":"gs://cloud-samples-data/generative-ai/image/cricket.jpeg","mimeType":"image/jpeg"},"text":null}],"role":"user"}],"system_instruction":null}	{"candidates":[{"avgLogprobs":-1.1522379716237385,"content":{"parts":[{"text":"これは素晴らしい質問です！動画と画像はまったく関係ありません。"}],"role":"model"},"finishReason":"STOP"}],"modelVersion":"gemini-1.5-flash-002@default","usageMetadata":{"candidatesTokenCount":12,"promptTokenCount":29177,"totalTokenCount":29189}}
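Because the table schema stores `request` and `response` as STRING columns, the JSON has to be decoded before the fields can be used. As one example, the sketch below totals the token usage across rows, which is the quantity billing is based on. The function name `summarize_usage` is hypothetical; the sample rows contain only the `usageMetadata` values shown in the table above.

```python
import json


def summarize_usage(rows):
    """Sum usageMetadata token counts over rows whose "response" is a JSON string."""
    totals = {"promptTokenCount": 0, "candidatesTokenCount": 0, "totalTokenCount": 0}
    for row in rows:
        # Rows without a response (for example, failed requests) contribute nothing.
        response = json.loads(row["response"]) if row.get("response") else {}
        usage = response.get("usageMetadata") or {}
        for name in totals:
            totals[name] += usage.get(name, 0)
    return totals


rows = [
    {"response": '{"usageMetadata": {"candidatesTokenCount": 154, '
                 '"promptTokenCount": 16830, "totalTokenCount": 16984}}'},
    {"response": '{"usageMetadata": {"candidatesTokenCount": 12, '
                 '"promptTokenCount": 29177, "totalTokenCount": 29189}}'},
]
print(summarize_usage(rows))
```

With the DataFrame from the query cell, the rows can be supplied as `summarize_usage(df.to_dict("records"))`.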

It takes longer than running prompts synchronously, but if it fits your requirements, this is a welcome update that can cut costs substantially.
