资源说明:Cord (rope) implementation for D programming language, with rich high performance string processing library
" ~ title,
which then will be used for last concatenation and forgotten. Compiler is smart here.
Suppose s holds now 50 characters;
Line b:
Now we asked compiler to append "
Cords for D version 1.0
BSD License
PREAMBLE Problem
=================================
Suppose you want to prepare HTML document in your web server written in D.
Let it be simple on the first try:
string g(string x) {
return x ~ " - " ~ x ~ "";
}
string extraargs(string s) {
return "s=2";
}
string generate_1(string title, string[] points) {
string s = "" ~ title ~ " "; // line a
if (points.length > 0) {
s ~= "- "; // line b
foreach (point; points) { // line c
s ~= "
- " ~ g(point) ~ " "; // line d } s ~= "
- " because we want to create list.
But s array is too small to hold this additional 4 bytes. Compiler
creates new array with size 4 bytes larger, copies whole 50 characters from original
s and then on the end of new array 4 bytes from appended text.
OK, we now see something is going wrong.
Line c and d:
We are now iterating lists of points, for each of it we first create "
- " some function g on point, and " ". Then we concatenate this 3 arrays (without creating temporaries), and resulting array (which is temporary) append to s, by resizing s and copying it. Line e: Same as b, but now to just append 5 bytes, we need to copy maybe few kilobytes (or megabytes!) of data. Line f: Same as e Line g: We want here to prepend and append web page with html tags, additionally html tag have some attributes depended on just generated content of web page. Compiler first calculates values of this 5 arrays, take their length, create new array with the size which is a sum of thees lengths, and copy all of them into proper places. Such simple example shows us that something is bad in text processing here. We are creating lots of temporary arrays and do excessive copings and allocations. This simple function can quickly be of O(n^2) complexity, because of this hidden costs. What if we have ten thousands of points? There is multiple possibilities. The main problem here is that array s is created always only as big as to hold needed data. But we know that it will be appended extensively. We can estimate size of this array, or resize it more aggressively. How to estimate size ? We can just guess: maybe 100 + title.length + points.length * 100 will be sufficient approximation? Or maybe we should count it more precisely? Try 2: string extraargs(char[] s) { return "s=2"; } size_t g_size(string x) { return x.length + " - ".length + x.length + "".length; } size_t generate_2_size(string title, string[] points) { size_t s = "
- ".length + g_size(point) + " ".length; // line d } s += "
- " ~ g(point) ~ " "); // line d } append("
- " ~ Cord(g(point)) ~ " "; // line d } s ~= "
- ".length; // line b
foreach (point; points) { // line c
s += "
- "); // line b
foreach (point; points) { // line c
append("
- "; // line b
foreach (point; points) { // line c
s ~= "
本源码包内暂不包含可直接显示的源代码文件,请下载源码包。
English
