将大型 JSON 文件存储到 Oracle DB 中
我使用以下脚本从 MongoDB 读取数据作为 JSON 文件.
I'm using the following script in order to read data from MongoDB as JSON files.
DECLARE
l_param_list VARCHAR2(512);
l_http_request UTL_HTTP.req;
l_http_response UTL_HTTP.resp;
l_response_text CLOB;
--l_response_text VARCHAR2(32767);
l_list json_list;
A_id VARCHAR2(100);
Photo VARCHAR2(32767);
A_Name VARCHAR2(100);
Remarks VARCHAR2(100);
Status VARCHAR2(100);
UserId VARCHAR2(100);
A_Date VARCHAR2(100);
A_Time VARCHAR2(100);
MSG_status VARCHAR2(100);
Oracle_Flag VARCHAR2(100);
acl VARCHAR2(100);
BEGIN
-- service's input parameters
-- preparing Request...
l_http_request := UTL_HTTP.begin_request('https://api.appery.io/rest/1/db/collections/Photos?where=%7B%22Oracle_Flag%22%3A%22Y%22%7D'
, 'GET'
, 'HTTP/1.1');
-- ...set header's attributes
UTL_HTTP.set_header(l_http_request, 'X-Appery-Database-Id', '53f2dac5e4b02cca64021dbe');
--UTL_HTTP.set_header(l_http_request, 'Content-Length', LENGTH(l_param_list));
-- ...set input parameters
-- UTL_HTTP.write_text(l_http_request, l_param_list);
-- get Response and obtain received value
l_http_response := UTL_HTTP.get_response(l_http_request);
UTL_HTTP.read_text(l_http_response, l_response_text);
DBMS_OUTPUT.put_line(l_response_text);
l_list := json_list(l_response_text);
FOR i IN 1..l_list.count
LOOP
A_id := json_ext.get_string(json(l_list.get(i)),'_id');
Photo := json_ext.get_string(json(l_list.get(i)),'Photo');
A_Name := json_ext.get_string(json(l_list.get(i)),'Name');
Remarks := json_ext.get_string(json(l_list.get(i)),'Remarks');
Status := json_ext.get_string(json(l_list.get(i)),'Status');
UserId := json_ext.get_string(json(l_list.get(i)),'UserId');
A_Date := json_ext.get_string(json(l_list.get(i)),'Date');
A_Time := json_ext.get_string(json(l_list.get(i)),'Time');
MSG_status := json_ext.get_string(json(l_list.get(i)),'MSG_status');
Oracle_Flag := json_ext.get_string(json(l_list.get(i)),'Oracle_Flag');
acl := json_ext.get_string(json(l_list.get(i)),'acl');
insert into Appery_Photos values(
A_id,
Photo,
A_Name,
Remarks,
Status,
UserId,
A_Date,
A_Time,
MSG_status ,
Oracle_Flag,
acl
);
end loop;
-- finalizing
UTL_HTTP.end_response(l_http_response);
EXCEPTION
WHEN UTL_HTTP.end_of_body
THEN UTL_HTTP.end_response(l_http_response);
END;
/
该脚本适用于小型 JSON 文件.但是,当文件包含 base64 文件(以 base64 格式表示的照片)时,脚本无法给出错误(未找到字符串结尾).
The script works fine with small JSON files. However, when the file contains base64 file (photos represeted in base64 formate) the script fails giving the error (string ending not found).
显然,错误的原因是没有复制整个文件,因此 JSON 解析器找不到字符串]"的结尾;或}".
Apparently, the error results from the fact that not the entire file is copied, such that the JSON parser cannot find the end of the string "]" or "}".
我尝试使用最大大小为 32767 的 CLOB 和 VARCHAR2,但这根本不够.
I tried using CLOB and VARCHAR2 with maximum size 32767 but that is not enough at all.
我想解码 base64 文件,但问题是我需要先读取文件才能解码该字段.
I thought of decoding base64 file, but the problme is I need to read the file first before I can decode that field.
任何建议将不胜感激.
这两个答案都提供了读取大型 JSON 文件 (>32KB) 的解决方案,我使用了 @Jeffrey Kemp 之一.然而,作为下一个问题,其中一个 json_values/fields 本身是 >32KB,json_ext.get_string 只返回一个 VARCHAR2,这意味着它被限制为最大 32767 个字节.因此,对于值大于 32KB 的 Photo 字段,我使用了 json_ext.get_json_value 和 dbms_lob.createtemporary.稍微整理一下后得到的相关脚本如下:
Both answers provide solutions for reading large JSON files (>32KB), I used @Jeffrey Kemp one. However, as a next problem, one of the json_values/fields itself is >32KB, the json_ext.get_string only returns a VARCHAR2, which means it is limited to 32767 byes max. So for the field Photo, where the value is >32KB I used json_ext.get_json_value along with dbms_lob.createtemporary. The resultant relevant script after tidy up a bit is as follows:
DECLARE
l_val json_value;
l_param_list VARCHAR2(512);
l_http_request UTL_HTTP.req;
l_http_response UTL_HTTP.resp;
l_response_text CLOB;
--l_response_text VARCHAR2(32767);
l_list json_list;
A_id VARCHAR2(100);
Photo VARCHAR2(32767);
A_Name VARCHAR2(100);
Remarks VARCHAR2(100);
Status VARCHAR2(100);
UserId VARCHAR2(100);
A_Date VARCHAR2(100);
A_Time VARCHAR2(100);
MSG_status VARCHAR2(100);
Oracle_Flag VARCHAR2(100);
acl VARCHAR2(100);
BEGIN
-- service's input parameters
-- preparing Request...
l_http_request := UTL_HTTP.begin_request('https://api.appery.io/rest/1/db/collections/Photos?where=%7B%22Oracle_Flag%22%3A%22Y%22%7D'
, 'GET'
, 'HTTP/1.1');
-- ...set header's attributes
UTL_HTTP.set_header(l_http_request, 'X-Appery-Database-Id', '53f2dac5e4b02cca64021dbe');
--UTL_HTTP.set_header(l_http_request, 'Content-Length', LENGTH(l_param_list));
-- ...set input parameters
-- UTL_HTTP.write_text(l_http_request, l_param_list);
-- get Response and obtain received value
l_http_response := UTL_HTTP.get_response(l_http_request);
BEGIN
LOOP
UTL_HTTP.read_text(l_http_response, buf);
l_response_text := l_response_text || buf;
END LOOP;
EXCEPTION
WHEN UTL_HTTP.end_of_body THEN
NULL;
END;
l_list := json_list(l_response_text);
FOR i IN 1..l_list.count
LOOP
A_id := json_ext.get_string(json(l_list.get(i)),'_id');
l_val := json_ext.get_json_value(json(l_list.get(i)),'Photo');
dbms_lob.createtemporary(Photo, true, 2);
json_value.get_string(l_val, Photo);
A_Name := json_ext.get_string(json(l_list.get(i)),'Name');
Remarks := json_ext.get_string(json(l_list.get(i)),'Remarks');
Status := json_ext.get_string(json(l_list.get(i)),'Status');
UserId := json_ext.get_string(json(l_list.get(i)),'UserId');
A_Date := json_ext.get_string(json(l_list.get(i)),'Date');
A_Time := json_ext.get_string(json(l_list.get(i)),'Time');
MSG_status := json_ext.get_string(json(l_list.get(i)),'MSG_status');
Oracle_Flag := json_ext.get_string(json(l_list.get(i)),'Oracle_Flag');
acl := json_ext.get_string(json(l_list.get(i)),'acl');
insert into Appery_Photos values(
A_id,
Photo,
A_Name,
Remarks,
Status,
UserId,
A_Date,
A_Time,
MSG_status ,
Oracle_Flag,
acl
);
end loop;
-- finalizing
UTL_HTTP.end_response(l_http_response);
EXCEPTION
WHEN UTL_HTTP.end_of_body
THEN UTL_HTTP.end_response(l_http_response);
END;
/
推荐答案
您的问题在于您对 UTL_HTTP.read_text
的调用.您正在传递一个 CLOB,但 read_text 只接受 VARCHAR2,因此它最多可以返回 32k 字节.
Your problem is in your call to UTL_HTTP.read_text
. You are passing a CLOB, but read_text only accepts VARCHAR2, so it can return a maximum of 32k bytes.
您需要使用 VARCHAR2 缓冲区在循环中调用它,并将结果连接到您的 CLOB 中,例如:
You need to call it in a loop, using a VARCHAR2 buffer, and concatenate the results into your CLOB, e.g.:
DECLARE
buf VARCHAR2(32767);
BEGIN
LOOP
UTL_HTTP.read_text(l_http_response, buf);
l_response_text := l_response_text || buf;
END LOOP;
EXCEPTION
WHEN UTL_HTTP.end_of_body THEN
NULL;
END;
http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/u_http.htm#ARPLS71074
你的第二个问题是 json_ext.get_string
只返回一个 VARCHAR2,这意味着它被限制为最大 32767 个字节.我浏览了 PL/json wiki,您可能需要访问向其中一位作者询问如何使用它来获取 CLOB 值.
Your second problem is that json_ext.get_string
only returns a VARCHAR2, which means it is limited to 32767 byes max. I've had a browse of the PL/json wiki, you might need to reach out to one of the authors to find out how to use it to get a CLOB value.
相关文章