How to Build End-to-End Captcha Recognition with Serverless Cloud Function (SCF) and Kaggle, from Training to Deployment
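This article trains a captcha-recognition model and deploys it to the cloud. The model is trained on an open dataset from Kaggle and served through Serverless Cloud Function (SCF), which gives an end-to-end captcha recognition pipeline.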
Dataset:
Open dataset: https://www.kaggle.com/c/captcha-version-2-images/data
Training set:
Validation set:
Test set:
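The training, validation, and test sets can be produced by shuffling the image files once and slicing the resulting list. The sketch below shows one way to do that. The helper name `split_files`, the 80/10/10 ratios, and the fixed seed are illustrative assumptions, not details taken from the dataset.

```python
import random


def split_files(paths, val_frac=0.1, test_frac=0.1, seed=123):
    """Shuffle paths deterministically and split them into train/val/test lists.

    Hypothetical helper: the ratios and the seed are assumptions for illustration.
    """
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    # Sort first so the split does not depend on the input order.
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test
```

Using a fixed seed keeps the three sets stable between runs, so evaluation numbers stay comparable.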
Model training:
Train the model with TensorFlow. The code is as follows:
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers


def plot_history(history):
    """Plot training and validation accuracy and loss per epoch."""
    fig, axs = plt.subplots(1, 2, figsize=(10, 5))
    # Accuracy curves
    axs[0].plot(history.history['accuracy'], label='train')
    axs[0].plot(history.history['val_accuracy'], label='test')
    axs[0].set_title('Model Accuracy')
    axs[0].set_ylabel('Accuracy')
    axs[0].set_xlabel('Epoch')
    axs[0].legend(loc='upper left')
    # Loss curves
    axs[1].plot(history.history['loss'], label='train')
    axs[1].plot(history.history['val_loss'], label='test')
    axs[1].set_title('Model Loss')
    axs[1].set_ylabel('Loss')
    axs[1].set_xlabel('Epoch')
    axs[1].legend(loc='upper left')
    plt.show()


def create_model():
    """Small CNN: 28x28 grayscale input, 10 output classes."""
    model = tf.keras.models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model


def load_data():
    # Each directory must contain one subdirectory per class;
    # labels are inferred from the subdirectory names.
    train_dir = './captcha-version-2-images/train'
    test_dir = './captcha-version-2-images/test'
    train_dataset = tf.keras.preprocessing.image_dataset_from_directory(
        train_dir,
        validation_split=0.2,
        subset="training",
        seed=123,
        color_mode="grayscale",  # match the model's single input channel
        image_size=(28, 28),
        batch_size=64)
    test_dataset = tf.keras.preprocessing.image_dataset_from_directory(
        test_dir,
        color_mode="grayscale",
        image_size=(28, 28),
        batch_size=64)
    return train_dataset, test_dataset


if __name__ == '__main__':
    # TensorFlow 2.x executes eagerly by default, so it does not need to be enabled.
    train_dataset, test_dataset = load_data()
    model = create_model()
    history = model.fit(
        train_dataset,
        epochs=10,
        validation_data=test_dataset,
        verbose=1)
    plot_history(history)
The training results are as follows:
Model evaluation:
model.evaluate(test_dataset)
The evaluation results are as follows:
Saving the model:
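`model.evaluate` reports accuracy over individual class predictions. For a captcha it can also help to know how many whole strings were read correctly, because a single wrong character fails the captcha. The sketch below computes both figures. It assumes predictions and ground truth have already been decoded to strings, and the function name `captcha_accuracy` is hypothetical.

```python
def captcha_accuracy(predictions, truths):
    """Return (whole_string_accuracy, per_character_accuracy).

    Hypothetical helper: both arguments are lists of decoded captcha strings.
    """
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        return 0.0, 0.0
    exact = 0
    char_hits = 0
    char_total = 0
    for pred, truth in zip(predictions, truths):
        if pred == truth:
            exact += 1
        char_total += len(truth)
        # Compare position by position; characters missing from pred count as misses.
        char_hits += sum(p == t for p, t in zip(pred, truth))
    char_acc = char_hits / char_total if char_total else 0.0
    return exact / len(truths), char_acc
```

The gap between the two numbers shows how much per-character errors compound over a full captcha string.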
model.save('captcha_model.h5')
Model deployment:
Use Serverless Cloud Function (SCF). The code is as follows:
import json
import os

from tencentcloud.common import credential
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.scf.v20180416 import scf_client, models


def main_handler(event, context):
    # Credentials are read from the function's environment variables.
    cred = credential.Credential(
        os.environ.get("TENCENTCLOUD_SECRET_ID"),
        os.environ.get("TENCENTCLOUD_SECRET_KEY"))
    httpProfile = HttpProfile()
    httpProfile.endpoint = "scf.tencentcloudapi.com"
    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile
    client = scf_client.ScfClient(cred, "ap-shanghai", clientProfile)
    try:
        # SDK request objects are populated by attribute assignment.
        req = models.InvokeRequest()
        req.FunctionName = os.environ.get("SCF_FUNCTION_NAME")
        req.Qualifier = "$LATEST"
        req.ClientContext = json.dumps(event)  # forward the incoming event
        req.LogType = "Tail"
        req.Namespace = "default"
        resp = client.Invoke(req)
        print(resp.to_json_string())
    except TencentCloudSDKException as err:
        print(err)
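The function above forwards the incoming event to another function; the recognition step itself is not shown. A minimal sketch of what an inference handler could look like follows. It assumes an API Gateway style event whose `body` is a JSON string carrying a base64-encoded `image` field, and it takes the prediction step as an injected callable, so that loading the saved model (for example with `tf.keras.models.load_model('captcha_model.h5')`) stays outside the sketch. The names `make_handler`, `predict_fn`, and the `image` and `text` fields are hypothetical.

```python
import base64
import binascii
import json


def make_handler(predict_fn):
    """Build an SCF-style handler around predict_fn(image_bytes) -> str."""

    def main_handler(event, context):
        try:
            body = json.loads(event.get("body") or "{}")
            image_bytes = base64.b64decode(body["image"], validate=True)
        except (ValueError, KeyError, TypeError, binascii.Error):
            return _response(400, {"error": "expected JSON body with base64 'image'"})
        return _response(200, {"text": predict_fn(image_bytes)})

    return main_handler


def _response(status, payload):
    # Response shape assumed for an API Gateway integration.
    return {
        "isBase64Encoded": False,
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Injecting `predict_fn` lets the model be loaded once at module import time rather than on every invocation, and lets the handler be tested without TensorFlow installed.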
Test with the Kaggle captcha dataset. The code is as follows:
from io import BytesIO

import matplotlib.pyplot as plt
import requests
from PIL import Image


def show_img(img):
    """Display a PIL image."""
    plt.imshow(img)
    plt.show()


def main():
    url = "https://xxxxx.execute-api.ap-shanghai.amazonaws.com/Test/captcha"
    response = requests.get(url)
    if response.status_code == 200:
        # Decode the response body as an image before displaying it.
        img = Image.open(BytesIO(response.content))
        show_img(img)
    else:
        print("error code: {}".format(response.status_code))


if __name__ == '__main__':
    main()
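To submit a captcha image for recognition, a client could base64-encode the file and POST it as JSON. The sketch below assumes the endpoint accepts a body of the form `{"image": ...}` and answers with `{"text": ...}`; those field names and the helpers `build_payload` and `recognize` are hypothetical, not part of any deployed API.

```python
import base64
import json


def build_payload(image_bytes):
    """Encode raw image bytes as the JSON body the endpoint is assumed to accept."""
    return json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})


def recognize(url, image_bytes, post=None):
    """POST the image to url and return the recognized text.

    post defaults to requests.post; pass a stand-in to test without a network.
    """
    if post is None:
        import requests  # imported lazily so the helper can be tested offline
        post = requests.post
    response = post(url, data=build_payload(image_bytes),
                    headers={"Content-Type": "application/json"}, timeout=10)
    if response.status_code != 200:
        raise RuntimeError("error code: {}".format(response.status_code))
    return response.json()["text"]
```

A call would look like `recognize(url, open("captcha.png", "rb").read())`, where `captcha.png` stands for any image from the test set.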
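The captcha recognition results are as follows:
This article used Serverless Cloud Function (SCF) and a Kaggle captcha dataset to build end-to-end captcha recognition, covering dataset preparation, model training, model deployment, and captcha recognition, so that the recognition process is automated.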