Split text into pages and present them separately (HTML5)

2022-01-17 00:00:00 javascript html css html5-canvas

Let's say we have a long text like Romeo & Juliet and we want to present it in a simple e-reader (no animations, only pages and a custom font size). What approaches exist for doing this?

What I have come up with so far:

  • Using CSS3 columns, it would be possible to load the entire text into memory and style it so that a single column takes up the size of an entire page. In practice this turned out to be extremely hard to control, and it requires the entire text to be loaded into memory.
  • Using CSS3 regions (not supported in any major browser) would follow the same basic concept as the previous solution, with the major difference that it wouldn't be as hard to control (as every "column" is a self-contained element).
  • Drawing the text on a canvas would let you know exactly where the text ends and thus draw the next page from that point. One advantage is that you only need to load the text up to the current page (still bad, but better). The disadvantage is that the text can't be interacted with (for example, it can't be selected).
  • Place every single word inside an element and give every element a unique id (or keep a logical reference in JavaScript), then use document.elementFromPoint to find the element (word) that is last on the page and show the next page onward from that word. Although this is the only one that seems realistic to me, the overhead it generates has to be immense.

Yet none of those seems acceptable (the first didn't give enough control to even get it to work, the second isn't supported yet, the third is hard and lacks text selection, and the fourth has ridiculous overhead). Are there any good approaches I haven't thought of yet, or ways to solve one or more disadvantages of the methods mentioned? (Yes, I am aware this is a fairly open question, but the more open it is, the higher the chance of producing relevant answers.)

Recommended answer

SVG may be a very good fit for your text pagination

  • SVG text is actually text, unlike canvas, which displays just a picture of text.
  • SVG text is readable, selectable, and searchable.
  • SVG text does not auto-wrap natively, but this is easily remedied using JavaScript.
  • Flexible page sizes are possible because page formatting is done in JavaScript.
  • Pagination does not rely on browser-dependent formatting.
  • Text downloads are small and efficient. Only the text for the current page needs to be downloaded.

Here are the details of how to do SVG pagination, along with a demo:

http://jsfiddle.net/m1erickson/Lf4Vt/

Part 1: Efficiently fetch about a page's worth of words from a database on the server

Store the entire text in a database with 1 word per row.

Each row (word) is sequentially indexed by the word's order (word#1 has index==1, word#2 has index==2, etc.).

For example, this would fetch the entire text in proper word order:

// select the entire text of Romeo and Juliet
// "order by wordIndex" causes the words to be in proper order

Select word from RomeoAndJuliet order by wordIndex

If you assume any page contains about 250 words when formatted, then this database query will fetch the first 250 words of text for page#1:

// select the first 250 words for page#1

Select top 250 word from RomeoAndJuliet order by wordIndex

Now the good part!

Let's say page#1 used 212 words after formatting. Then when you're ready to process page#2 you can fetch 250 more words starting at word#213. This results in quick and efficient data fetches.

// select 250 more words for page#2
// "where wordIndex>212" causes the fetched words
// to begin with the 213th word in the text
// (the where clause must come before order by)

Select top 250 word from RomeoAndJuliet where wordIndex>212 order by wordIndex
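On the client, the fetch-then-advance bookkeeping described above might look roughly like this. It is a minimal sketch rather than the answer's own code: fetchWords is a hypothetical stand-in for the database query (here it only slices an in-memory array), and createPageCursor, next and commit are illustrative names.

```javascript
// Hypothetical stand-in for the server query
//   "select top N word from RomeoAndJuliet where wordIndex > lastIndex order by wordIndex".
// wordIndex is 1-based, so the word with index k lives at allWords[k - 1].
function fetchWords(allWords, lastIndex, count) {
    return allWords.slice(lastIndex, lastIndex + count);
}

// Tracks the index of the last word already shown and hands out the next batch.
function createPageCursor(allWords, approxWordsPerPage) {
    var lastIndex = 0; // nothing displayed yet
    return {
        // fetch roughly one page of words, starting after the last displayed word
        next: function () {
            return fetchWords(allWords, lastIndex, approxWordsPerPage);
        },
        // record how many of the fetched words actually fit on the page
        commit: function (wordsUsed) {
            lastIndex += wordsUsed;
            return lastIndex;
        }
    };
}
```

With 250-word batches, if page#1 ends up using 212 words, calling commit(212) makes the following next() call start at word#213, which matches the query above.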

Part 2: Format the fetched words into lines of text that fit the specified page width

Each line of text must contain enough words to fill the specified page width, but no more.

Start line#1 with a single word and then add words 1-at-a-time for as long as the text still fits in the specified page width.

After the first line is fitted, we move down by a line-height and begin line#2.

Fitting the words on a line requires measuring the line each time another word is added. When the next word would exceed the line width, that extra word is moved to the next line.

Text can be measured using the HTML canvas context.measureText method.

This code will take a set of words (like the 250 words fetched from the database) and will format as many words as possible to fill the page size.

maxWidth is the maximum pixel width of a line of text.

maxLines is the maximum number of lines that will fit on a page.

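// Split words into at most maxLines lines, each no wider than maxWidth pixels.
// Relies on two variables defined elsewhere: ctx (a canvas 2D context whose
// font matches the displayed text) and wordCount (running count of words placed).
function textToLines(words,maxWidth,maxLines,x,y){

    var lines=[];

    while(words.length>0 && lines.length<maxLines){
        var line=getOneLineOfText(words,maxWidth);
        // keep only the words that did not fit on this line
        words=words.slice(line.index+1);
        lines.push(line);
        wordCount+=line.index+1;
    }

    return(lines);
}

// Return the text of one line plus the index of the last word that fit on it.
function getOneLineOfText(words,maxWidth){
    var line="";
    var space="";
    for(var i=0;i<words.length;i++){
        var testWidth=ctx.measureText(line+space+words[i]).width;
        // stop before the word that overflows; always keep at least one word
        // so a single over-wide word cannot produce an empty line
        if(testWidth>maxWidth && i>0){return({index:i-1,text:line});}
        line+=space+words[i];
        space=" ";
    }
    return({index:words.length-1,text:line});
}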
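To see how line fitting drives pagination across several pages, here is a self-contained sketch. It is not the answer's code: paginate, fitLine and the injected measure function are illustrative names. In a browser, measure would typically wrap ctx.measureText(text).width, with ctx.font set to the font used for display.

```javascript
// Greedily take words from `words`, starting at `start`, until the next word
// would push the line past maxWidth. Returns the line text and the index of
// the first word that did NOT fit.
function fitLine(words, start, maxWidth, measure) {
    var line = "";
    var i = start;
    while (i < words.length) {
        var candidate = line === "" ? words[i] : line + " " + words[i];
        // always accept the first word so an over-wide word cannot stall the loop
        if (line !== "" && measure(candidate) > maxWidth) { break; }
        line = candidate;
        i++;
    }
    return { text: line, next: i };
}

// Split the whole word list into pages of at most maxLines lines each.
function paginate(words, maxWidth, maxLines, measure) {
    var pages = [];
    var next = 0;
    while (next < words.length) {
        var lines = [];
        while (next < words.length && lines.length < maxLines) {
            var fitted = fitLine(words, next, maxWidth, measure);
            lines.push(fitted.text);
            next = fitted.next;
        }
        pages.push(lines);
    }
    return pages;
}
```

As a font-independent check, take a stand-in measurer that treats every character as 10 pixels wide, a maxWidth of 60 and a maxLines of 2. The words of "to be or not to be that is the question" then split into three pages: "to be" / "or not", then "to be" / "that", then "is the" / "question". The last line is over-wide but is still emitted, because each line accepts at least one word.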

Part 3: Display the lines of text using SVG

The SVG Text element is a real DOM element, so its text can be read, selected and searched.

Each individual line of text in the SVG Text element is displayed using an SVG Tspan element.

This code takes the lines of text which were formatted in Part#2 and displays the lines as a page of text using SVG.

// Render one page: a single SVG <text> element with one <tspan> per line.
// Relies on two variables defined elsewhere: lineHeight (pixels between
// baselines) and $page (jQuery wrapper of the target page div).
function drawSvg(lines,x){
    var svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
    var sText = document.createElementNS('http://www.w3.org/2000/svg', 'text');
    sText.setAttributeNS(null, 'font-family', 'verdana');
    sText.setAttributeNS(null, 'font-size', "14px");
    sText.setAttributeNS(null, 'fill', '#000000');
    for(var i=0;i<lines.length;i++){
        var sTSpan = document.createElementNS('http://www.w3.org/2000/svg', 'tspan');
        // reset x for every line and move down one line-height from the previous line
        sTSpan.setAttributeNS(null, 'x', x);
        sTSpan.setAttributeNS(null, 'dy', lineHeight+"px");
        sTSpan.appendChild(document.createTextNode(lines[i].text));
        sText.appendChild(sTSpan);
    }
    svg.appendChild(sText);
    $page.append(svg);
}
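If building the page as a markup string is more convenient than creating DOM nodes (for example, to render on the server), the same text/tspan structure can be produced as below. This is an alternative sketch, not the answer's method: linesToSvgMarkup and escapeXml are illustrative names, and the font settings simply mirror the ones used above.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(s) {
    return String(s)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Build an <svg> string with one <tspan> per line, each shifted down by lineHeight.
// `lines` is an array of objects with a `text` property, as returned by textToLines.
function linesToSvgMarkup(lines, x, lineHeight) {
    var tspans = lines.map(function (line) {
        return '<tspan x="' + x + '" dy="' + lineHeight + 'px">' +
            escapeXml(line.text) + '</tspan>';
    });
    return '<svg xmlns="http://www.w3.org/2000/svg">' +
        '<text font-family="verdana" font-size="14px" fill="#000000">' +
        tspans.join("") +
        '</text></svg>';
}
```

Escaping matters here because the play's text may contain characters such as & or <, which createTextNode handles automatically but string concatenation does not.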

Here is the complete code, just in case the demo link breaks:

<!doctype html>
<html>
<head>
<link rel="stylesheet" type="text/css" media="all" href="css/reset.css" /> <!-- reset css -->
<script type="text/javascript" src="http://code.jquery.com/jquery.min.js"></script>
<style>
    body{ background-color: ivory; }
    .page{border:1px solid red;}
</style>
<script>
$(function(){

    var canvas=document.createElement("canvas");
    var ctx=canvas.getContext("2d");
    ctx.font="14px verdana";

    var pageWidth=250;
    var pageHeight=150;
    var pagePaddingLeft=10;
    var pagePaddingRight=10;
    var approxWordsPerPage=500;        
    var lineHeight=18;
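    // textToLines loops while lines.length<=maxLines, so pass one less
    // than the number of lines that fit in the page height
    var maxLinesPerPage=Math.floor(pageHeight/lineHeight)-1;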
    var x=pagePaddingLeft;
    var y=lineHeight;
    var maxWidth=pageWidth-pagePaddingLeft-pagePaddingRight;
    var text="Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";

    // # words that have been displayed 
    //(used when ordering a new page of words)
    var wordCount=0;

    // size the divs to the desired page size
    var $pages=$(".page");
    $pages.width(pageWidth);
    $pages.height(pageHeight);


    // Test: Page#1

    // get a reference to the page div
    var $page=$("#page");
    // use html canvas to word-wrap this page
    var lines=textToLines(getNextWords(wordCount),maxWidth,maxLinesPerPage,x,y);
    // create svg elements for each line of text on the page
    drawSvg(lines,x);

    // Test: Page#2 (just testing...normally there's only 1 full-screen page)
    $page=$("#page2");
    lines=textToLines(getNextWords(wordCount),maxWidth,maxLinesPerPage,x,y);
    drawSvg(lines,x);

    // Test: Page#3 (just testing...normally there's only 1 full-screen page)
    $page=$("#page3");
    lines=textToLines(getNextWords(wordCount),maxWidth,maxLinesPerPage,x,y);
    drawSvg(lines,x);


    // fetch the next page of words from the server database
    // (since we've specified the starting point in the entire text,
    //  we only have to download 1 page of text as needed)
    // nextWordIndex is the number of words already displayed
    function getNextWords(nextWordIndex){
        // Eg: select top 500 word from romeoAndJuliet 
        //     where wordIndex>nextWordIndex
        //     order by wordIndex
        //
        // But here for testing, we just hardcode the entire text 
        var testingText="Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";
        var testingWords=testingText.split(" ");
        var words=testingWords.splice(nextWordIndex,approxWordsPerPage);

        // words now holds at most approxWordsPerPage words starting at nextWordIndex
        return(words);    
    }


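    // Split words into lines no wider than maxWidth pixels.
    // Note: the loop uses <=, so it emits up to maxLines+1 lines;
    // maxLinesPerPage above is reduced by 1 to compensate.
    function textToLines(words,maxWidth,maxLines,x,y){

        var lines=[];

        while(words.length>0 && lines.length<=maxLines){
            var line=getLineOfText(words,maxWidth);
            // keep only the words that did not fit on this line
            words=words.slice(line.index+1);
            lines.push(line);
            wordCount+=line.index+1;
        }

        return(lines);
    }

    // Return the text of one line plus the index of the last word that fit on it.
    function getLineOfText(words,maxWidth){
        var line="";
        var space="";
        for(var i=0;i<words.length;i++){
            var testWidth=ctx.measureText(line+space+words[i]).width;
            // always keep at least one word so an over-wide word cannot produce an empty line
            if(testWidth>maxWidth && i>0){return({index:i-1,text:line});}
            line+=space+words[i];
            space=" ";
        }
        return({index:words.length-1,text:line});
    }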

    function drawSvg(lines,x){
        var svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
        var sText = document.createElementNS('http://www.w3.org/2000/svg', 'text');
        sText.setAttributeNS(null, 'font-family', 'verdana');
        sText.setAttributeNS(null, 'font-size', "14px");
        sText.setAttributeNS(null, 'fill', '#000000');
        for(var i=0;i<lines.length;i++){
            var sTSpan = document.createElementNS('http://www.w3.org/2000/svg', 'tspan');
            sTSpan.setAttributeNS(null, 'x', x);
            sTSpan.setAttributeNS(null, 'dy', lineHeight+"px");
            sTSpan.appendChild(document.createTextNode(lines[i].text));
            sText.appendChild(sTSpan);
        }
        svg.appendChild(sText);
        $page.append(svg);
    }

}); // end $(function(){});
</script>
</head>
<body>
    <h4>Text split into "pages"<br>(Selectable & Searchable)</h4>
    <div id="page" class="page"></div>
    <h4>Page 2</h4>
    <div id="page2" class="page"></div>
    <h4>Page 3</h4>
    <div id="page3" class="page"></div>
</body>
</html>
