Render a .pdf to a single Canvas using pdf.js and ImageData
I am trying to read an entire .pdf document using PDF.js and then render all of its pages onto a single canvas.
My idea: render each page onto a canvas, grab its ImageData with context.getImageData(), then clear the canvas and move on to the next page. I store all the ImageData objects in an array, and once every page is in there I want to draw them all from the array onto a single canvas.
var pdf = null;
PDFJS.disableWorker = true;
var pages = new Array();
//Prepare some things
var canvas = document.getElementById('cv');
var context = canvas.getContext('2d');
var scale = 1.5;
PDFJS.getDocument(url).then(function getPdfHelloWorld(_pdf) {
    pdf = _pdf;
    //Render all the pages on a single canvas
    for(var i = 1; i <= pdf.numPages; i ++){
        pdf.getPage(i).then(function getPage(page){
            var viewport = page.getViewport(scale);
            canvas.width = viewport.width;
            canvas.height = viewport.height;
            page.render({canvasContext: context, viewport: viewport});
            pages[i-1] = context.getImageData(0, 0, canvas.width, canvas.height);
            context.clearRect(0, 0, canvas.width, canvas.height);
            p.Out("pre-rendered page " + i);
        });
    }
    //Now we have all 'dem Pages in "pages" and need to render 'em out
    canvas.height = 0;
    var start = 0;
    for(var i = 0; i < pages.length; i++){
        if(canvas.width < pages[i].width) canvas.width = pages[i].width;
        canvas.height = canvas.height + pages[i].height;
        context.putImageData(pages[i], 0, start);
        start += pages[i].height;
    }
});
So, the way I understand things, this should work, right? When I run it, I end up with a canvas that is big enough to contain all the pages of the PDF, but it doesn't show the PDF...
Thanks for your help.
Recommended answer
I can't speak to the part of your code that renders the PDF into a canvas, but I do see some problems.
- Every reset of canvas.width or canvas.height automatically clears the canvas contents. So in the top section, your clearRect is not needed, because setting canvas.width already clears the canvas before every page.render.
- More importantly, in the bottom section, every canvas resize clears all of your previous PDF drawings (oops!).
- getImageData() returns an ImageData object rather than a bare array. Its data property is a flat array in which each pixel takes 4 consecutive elements (red, then green, then blue, then alpha), and its width and height properties hold the pixel dimensions. The length of the data array alone cannot be used to determine the width or height.
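To make the pixel layout in the last point concrete, here is a minimal sketch that reads one pixel out of such a flat RGBA array. The helper name pixelAt and the 2x1 sample buffer are made up for illustration; in a browser you would pass in the pixel array obtained via getImageData() together with the image width.

```javascript
// Hypothetical helper: returns the [r, g, b, a] values of the pixel at (x, y)
// in a flat RGBA array whose rows are `width` pixels wide.
function pixelAt(data, width, x, y) {
    var offset = (y * width + x) * 4; // 4 array elements per pixel
    return [data[offset], data[offset + 1], data[offset + 2], data[offset + 3]];
}

// Sample buffer for a 2x1 image: one opaque red pixel, then one opaque blue pixel.
var sample = new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 255]);
console.log(pixelAt(sample, 2, 1, 0)); // values of the second pixel
```

Note that the width has to be supplied separately: the same 8-element array could just as well describe a 1x2 image.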
So to get you started, I would begin by changing your code to this (very, very untested!):
var pages = [];
// Prepare some things
var canvas = document.getElementById('cv');
var context = canvas.getContext('2d');
var scale = 1.5;
var canvasWidth = 0;   // width of the widest page
var canvasHeight = 0;  // accumulated height of all pages
var pageStarts = [0];  // "Y" starting position of each page

PDFJS.disableWorker = true;
PDFJS.getDocument(url).then(function getPdfHelloWorld(pdf) {
    // getPage() and render() are asynchronous, so the pages are rendered one
    // after another here instead of inside a plain for loop
    function renderPage(n) {
        if (n > pdf.numPages) {
            drawAllPages();
            return;
        }
        pdf.getPage(n).then(function getPage(page) {
            var viewport = page.getViewport(scale);
            // changing canvas.width and/or canvas.height auto-clears the canvas
            canvas.width = viewport.width;
            canvas.height = viewport.height;
            var task = page.render({canvasContext: context, viewport: viewport});
            // assumption: depending on the pdf.js version, render() returns either
            // a promise or a task object that exposes a .promise property
            (task.promise || task).then(function () {
                pages[n - 1] = context.getImageData(0, 0, canvas.width, canvas.height);
                // calculate the width of the final display canvas
                if (canvas.width > canvasWidth) {
                    canvasWidth = canvas.width;
                }
                // calculate the accumulated height of the final display canvas
                canvasHeight += canvas.height;
                // save the "Y" starting position of the next page
                pageStarts[n] = pageStarts[n - 1] + canvas.height;
                p.Out("pre-rendered page " + n);
                renderPage(n + 1);
            });
        });
    }

    // Runs once every page has been captured into "pages"
    function drawAllPages() {
        canvas.width = canvasWidth;
        canvas.height = canvasHeight; // this auto-clears all canvas contents
        for (var i = 0; i < pages.length; i++) {
            context.putImageData(pages[i], 0, pageStarts[i]);
        }
    }

    renderPage(1);
});
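The size bookkeeping in the code above can also be pulled out into a small pure function, which makes it easy to check without a browser. This is only a sketch: stackPages and the {width, height} page objects are names invented here and are not part of PDF.js.

```javascript
// Hypothetical helper: given page sizes in render order, compute the size of
// a canvas that stacks them vertically and the "Y" start of each page.
function stackPages(sizes) {
    var layout = { width: 0, height: 0, starts: [] };
    sizes.forEach(function (size) {
        layout.starts.push(layout.height); // a page begins where the previous one ended
        layout.width = Math.max(layout.width, size.width);
        layout.height += size.height;
    });
    return layout;
}

// Two made-up page sizes, in pixels.
var layout = stackPages([
    { width: 600, height: 800 },
    { width: 900, height: 400 }
]);
console.log(layout.width, layout.height, layout.starts);
```

Computing the full layout before touching the display canvas means the canvas only has to be resized once, so nothing drawn afterwards gets cleared.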
Alternatively, here is a more traditional way of accomplishing your task:
Use a single "display" canvas and allow the user to "page through" each desired page.
Since you already start by drawing each page into a canvas, why not keep a separate, hidden canvas for each page? Then, when the user wants to see page #6, you just copy hidden canvas #6 onto your display canvas.
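A rough sketch of that caching idea, kept independent of the DOM so the logic stays visible: each page is rendered at most once and then reused. createPageCache and the renderPage callback are hypothetical names. In a browser, renderPage would draw the page into its own hidden canvas and resolve with that canvas, and showing a page would then be a drawImage call from the cached canvas onto the display canvas.

```javascript
// Hypothetical cache: renderPage(n) is assumed to return a promise that
// resolves with the rendered surface (e.g. a hidden canvas) for page n.
function createPageCache(renderPage) {
    var cache = new Map();
    return function getPage(n) {
        if (!cache.has(n)) {
            cache.set(n, renderPage(n)); // store the promise so each page renders only once
        }
        return cache.get(n);
    };
}

// Stand-in renderer used only for this demonstration.
var renderCount = 0;
var getPage = createPageCache(function (n) {
    renderCount++;
    return Promise.resolve({ page: n });
});

// Asking for page 6 twice triggers a single render and yields the same object.
var demo = Promise.all([getPage(6), getPage(6)]).then(function (results) {
    console.log(results[0] === results[1], renderCount);
    return results;
});
```

Caching the promise rather than the finished result also covers the case where the user requests a page again while it is still rendering.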
The Mozilla devs use this approach in their PDF.js demo here: http://mozilla.github.com/pdf.js/web/viewer.html
You can check out the code for the viewer here: http://mozilla.github.com/pdf.js/web/viewer.js