插入的 403 速率限制有时会成功

2022-01-10 00:00:00 google-drive-api java

我在一个循环中插入 100 个文件.对于此测试,我已禁用退避并重试,因此如果插入失败并出现 403,我将忽略它并继续下一个文件.在 100 个文件中,我得到 63 403 个速率限制异常.

I insert 100 files in a loop. For this test I have DISABLED backoff and retry, so if an insert fails with a 403, I ignore it and proceed with the next file. Out of 100 files, I get 63 403 rate limit exceptions.

但是,在检查 Drive 时,在这 63 次失败中,有 3 次实际上成功了,即.该文件是在驱动器上创建的.如果我完成了通常的退避并重试,我最终会得到重复的插入.这证实了我在启用退避重试时看到的行为,即.从我的 100 个文件测试中,我一直看到 3-4 次重复插入.

However, on checking Drive, of those 63 failures, 3 actually succeeded, ie. the file was created on drive. Had I done the usual backoff and retry, I would have ended up with duplicated inserts. This confirms the behaviour I was seeing with backoff-retry enabled, ie. from my 100 file test, I am consistently seeing 3-4 duplicate insertions.

API 端点服务器和 Drive 存储服务器之间存在异步连接,这会导致不确定的结果,尤其是在大容量写入时.

It smells like there is an asynchronous connection between the API endpoint server and the Drive storage servers which is causing non-deterministic results, especially on high volume writes.

由于这意味着我不能依靠403 速率限制"来限制我的插入,我需要知道什么是安全的插入速率,以免触发这些计时错误.

Since this means I can't rely on "403 rate limit" to throttle my inserts, I need to know what is a safe insert rate so as not to trigger these timing bugs.

运行下面的代码,给出...

Running the code below, gives ...

Summary...
File insert attempts (a)       = 100
rate limit errors (b)          = 31
expected number of files (a-b) = 69
Actual number of files         = 73 

代码...

package com.cnw.test.servlets;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.api.client.auth.oauth2.Credential;
import com.google.api.client.googleapis.json.GoogleJsonError;
import com.google.api.client.googleapis.json.GoogleJsonResponseException;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.drive.Drive;
import com.google.api.services.drive.model.ChildList;
import com.google.api.services.drive.model.File;
import com.google.api.services.drive.model.File.Labels;
import com.google.api.services.drive.model.ParentReference;

import couk.cleverthinking.cnw.oauth.CredentialMediatorB;
import couk.cleverthinking.cnw.oauth.CredentialMediatorB.InvalidClientSecretsException;

@SuppressWarnings("serial")
    /**
     * 
     * AppEngine servlet to demonstrate that Drive IS performing an insert despite throwing a 403 rate limit exception.
     * 
     * All it does is create a folder, then loop to create x files. Any 403 rate limit exceptions are counted.
     * At the end, compare the expected number of file (attempted - 403) vs. the actual.
     * In a run of 100 files, I consistently see between 1 and 3 more files than expected, ie. despite throwing a 403 rate limit,
     * Drive *sometimes* creates the file anyway.
     * 
     * To run this, you will need to ...
     * 1) enter an APPNAME above
     * 2) enter a google user id above
     * 3) Have a valid stored credential for that user
     * 
     * (2) and (3) can be replaced by a manually constructed Credential 
     * 
     * Your test must generate rate limit errors, so if you have a very slow connection, you might need to run 2 or 3 in parallel. 
     * I run the test on a medium speed connection and I see 403 rate limits after 30 or so inserts.
     * Creating 100 files consistently exposes the problem.
     * 
     */
public class Hack extends HttpServlet {

    private final String APPNAME = "MyApp";  // ENTER YOUR APP NAME
    private final String GOOGLE_USER_ID_TO_FETCH_CREDENTIAL = "11222222222222222222222"; //ENTER YOUR GOOGLE USER ID
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
        /*
         *  set up the counters
         */
        // I run this as a servlet, so I get the number of files from the request URL
        int numFiles = Integer.parseInt(request.getParameter("numfiles"));
        int fileCount = 0;
        int ratelimitCount = 0;

        /*
         * Load the Credential
         */
        CredentialMediatorB cmb = null;
        try {
            cmb = new CredentialMediatorB(request);
        } catch (InvalidClientSecretsException e) {
            e.printStackTrace();
        }
        // this fetches a stored credential, you might choose to construct one manually
        Credential credential = cmb.getStoredCredential(GOOGLE_USER_ID_TO_FETCH_CREDENTIAL);

        /*
         * Use the credential to create a drive service
         */
        Drive driveService = new Drive.Builder(new NetHttpTransport(), new JacksonFactory(), credential).setApplicationName(APPNAME).build();

        /* 
         * make a parent folder to make it easier to count the files and delete them after the test
         */
        File folderParent = new File();
        folderParent.setTitle("403parentfolder-" + numFiles);
        folderParent.setMimeType("application/vnd.google-apps.folder");
        folderParent.setParents(Arrays.asList(new ParentReference().setId("root")));
        folderParent.setLabels(new Labels().setHidden(false));
        driveService.files().list().execute();
        folderParent = driveService.files().insert(folderParent).execute();
        System.out.println("folder made with id = " + folderParent.getId());

        /*
         * store the parent folder id in a parent array for use by each child file
         */
        List<ParentReference> parents = new ArrayList<ParentReference>();
        parents.add(new ParentReference().setId(folderParent.getId()));

        /*
         * loop for each file
         */
        for (fileCount = 0; fileCount < numFiles; fileCount++) {
            /*
             * make a File object for the insert
             */
            File file = new File();
            file.setTitle("testfile-" + (fileCount+1));
            file.setParents(parents);
            file.setDescription("description");
            file.setMimeType("text/html");

            try {
                System.out.println("making file "+fileCount + " of "+numFiles);
                // call the drive service insert execute method 
                driveService.files().insert(file).setConvert(false).execute();
            } catch (GoogleJsonResponseException e) {
                GoogleJsonError error = e.getDetails();
                // look for rate errors and count them. Normally one would expo-backoff here, but this is to demonstrate that despite
                // the 403, the file DID get created
                if (error.getCode() == 403 && error.getMessage().toLowerCase().contains("rate limit")) {
                    System.out.println("rate limit exception on file " + fileCount + " of "+numFiles);
                    // increment a count of rate limit errors
                    ratelimitCount++;
                } else {
                    // just in case there is a different exception thrown
                    System.out.println("[DbSA465] Error message: " + error.getCode() + " " + error.getMessage());
                }
            }
        }

        /* 
         * all done. get the children of the folder to see how many files were actually created
         */
        ChildList children = driveService.children().list(folderParent.getId()).execute();

        /*
         * and the winner is ...
         */
        System.out.println("
Summary...");
        System.out.println("File insert attempts (a)       = " + numFiles);
        System.out.println("rate limit errors (b)          = " + ratelimitCount);
        System.out.println("expected number of files (a-b) = " + (numFiles - ratelimitCount));
        System.out.println("Actual number of files         = " + children.getItems().size() + " NB. There is a limit of 100 children in a single page, so if you're expecting more than 100, need to follow nextPageToken");
    }
}

推荐答案

我假设您正在尝试进行并行下载...

这可能不是您要寻找的答案,但这是我在与 google drive api 交互时所经历的.我使用的是 C#,所以有点不同,但也许会有所帮助.

This may not be an answer you're looking for, but this is what I've experienced in my interactions with google drive api. I use C#, so it's a bit different, but maybe it'll help.

我必须设置特定数量的线程来一次运行.如果我让我的程序一次将所有 100 个条目作为单独的线程运行,我也会遇到速率限制错误.

I had to set a specific amount of threads to run at one time. If I let my program run all 100 entries at one time as separate threads, I run into the rate limit error as well.

我完全不了解,但是在我的C#程序中,我运行了3个线程(用户可定义,默认3个)

I don't know well at all, but in my C# program, I run 3 threads (definable by the user, 3 is default)

opts = new ParallelOptions { MaxDegreeOfParallelism = 3 };
var checkforfinished = 
Parallel.ForEach(lstBackupUsers.Items.Cast<ListViewItem>(), opts, name => {
{ // my logic code here }

我进行了快速搜索,发现 Java 8(不确定您是否正在使用)支持 Parallel().forEach(),也许这会对您有所帮助.我为此找到的资源位于:http://radar.oreilly.com/2015/02/java-8-streams-api-and-parallelism.html

I did a quick search and found that Java 8 (not sure if that's what you're using) supports Parallel().forEach(), maybe that'd help you. The resource I found for this is at: http://radar.oreilly.com/2015/02/java-8-streams-api-and-parallelism.html

希望这会有所帮助,轮流尝试在 SO 上帮助其他人,就像人们帮助我一样!

Hope this helps, taking my turns trying to help others on SO as people have helped me!

相关文章