elibc - new libc with eCos

An ultra-small UTF-8 C library for embedded systems.

o o o o o o o o o o o o o o
|--------------------------o
|                          o
| elibc 0.6.2 with eCos    o
|                          o
|--------------------------o
o o o o o o o o o o o o o o

elibc is an improved version of the libc/Newlib from eCos, and an advanced C runtime environment for embedded systems.

Features

Why Another C library?

  1. The GNU C Library is not small enough for embedded systems, and size-reduced libraries such as uClibc from μClinux, Newlib from eCos, and others are still too big.
  2. Traditional libc implementations are not practical for today's needs, especially for handling UTF-8. They lack Unicode-aware versions of the most frequently used string and stdio functions for data read from or written to flash or disk. These include the standard interface: toupper(), tolower(), strlen(), strchr(), strrchr(), strstr(), strtok(), getc(), putc(), fgetc(), fputc(), sprintf(), scanf(); the extended interface: strip(), capitalize(), split(), join(), replace(), startswith(), endswith(), reverse(), compress(), decompress(); and convenient hash table functions, e.g. set(k, v), get(k), foreach(), map(), filter(), reduce(), and so on.
  3. There is no truly lightweight Unicode C library in today's software industry that is designed specifically for embedded systems. The existing UTF-8 libraries (e.g. utf8.c, utf8proc, utfcpp, and ICU) and the UTF-8 handling routines built into programming languages (including C, C++, Lua, Tcl, JavaScript, Ruby, Python, Perl, nesC from TinyOS, Go from Google, Rust from Mozilla, and Java from Sun (Android, BlackBerry and Oracle)) are designed for their own ecosystems, or are too large or too simple. Most of them also carry a bloated burden: the Unicode data file.
  4. The regular expression libraries (the GNU regex library, Henry Spencer's regular expression libraries, Onigmo, Oniguruma, TRE, RE2 from Google, and PCRE) are excellent, but they share the same problem: they are not part of a general-purpose libc designed for embedded systems. They are too big or carry the bulky Unicode data, so they lose conciseness and flexibility and take up too much space.
  5. realloc() is not as easy as it looks. Dynamic memory management, or GC (garbage collection), is a big topic, and the algorithms behind it could fill a large book. A thread-safe, reliable memory management mechanism is not built into libc.
  6. In many embedded scenarios, a file system that can handle Unicode data is essential, and so is an in-memory embedded database.
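
To make point 2 concrete, here is a minimal sketch of a strlen-like function that counts characters instead of bytes. It is only an illustration under stated assumptions: the input is assumed to be valid UTF-8, and utf8_count is a hypothetical name chosen for this example, not a function taken from elibc.

```c
#include <stddef.h>

/* Hypothetical helper (not elibc's API): count the Unicode code points in a
 * NUL-terminated string that is assumed to be valid UTF-8. */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Continuation bytes have the form 10xxxxxx. Every other byte
         * starts a new code point, so only those are counted. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

With this sketch, a string holding the single character "あ" has a count of 1 even though it occupies three bytes. Note that it counts code points, so a base letter followed by a combining mark counts as two.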

Code snippets


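/* chr(): Unicode code point to UTF-8 string */
uchar u = 0x3042;
char *c = chr(u);
printf("%s\n", c); /* あ */

/* ord(): first character of a UTF-8 string to its code point */
const char *c = "あ";
uchar u = ord(c);
printf("%x\n", u); /* 3042 */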
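
A conversion pair like chr() and ord() could be built on top of a UTF-8 encoder and decoder. The sketch below shows one possible approach and is not elibc's implementation: utf8_encode and utf8_decode are hypothetical names, the caller supplies the output buffer, and the decoder does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write cp as UTF-8 into out and NUL-terminate it.
 * out must hold at least 5 bytes. Returns the number of bytes written
 * before the NUL, or 0 if cp is a surrogate or above U+10FFFF. */
size_t utf8_encode(uint32_t cp, char *out)
{
    unsigned char *o = (unsigned char *)out;
    size_t n = 0;

    if (cp < 0x80) {
        o[n++] = (unsigned char)cp;
    } else if (cp < 0x800) {
        o[n++] = (unsigned char)(0xC0 | (cp >> 6));
        o[n++] = (unsigned char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp < 0xD800 || cp > 0xDFFF) {   /* skip UTF-16 surrogates */
            o[n++] = (unsigned char)(0xE0 | (cp >> 12));
            o[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            o[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    } else if (cp < 0x110000) {
        o[n++] = (unsigned char)(0xF0 | (cp >> 18));
        o[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        o[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        o[n++] = (unsigned char)(0x80 | (cp & 0x3F));
    }
    o[n] = '\0';
    return n;
}

/* Hypothetical helper: decode the first code point of s. Returns U+FFFD
 * if the lead byte is invalid or a continuation byte is missing. */
uint32_t utf8_decode(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t cp;
    int extra;

    if (p[0] < 0x80)
        return p[0];
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
    else
        return 0xFFFD;

    for (int i = 1; i <= extra; i++) {
        /* A NUL terminator fails this check too, so the loop never
         * reads past the end of the string. */
        if ((p[i] & 0xC0) != 0x80)
            return 0xFFFD;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    return cp;
}
```

Encoding 0x3042 with this sketch yields the three bytes E3 81 82, which is "あ", and decoding those bytes gives 0x3042 back.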
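
/* toupper(): Unicode-aware upper-casing of a UTF-8 string */
const char *s = "ÿũēṫ";
char *us = toupper(s);
printf("%s\n", us); /* ŸŨĒṪ */

/* tolower(): Unicode-aware lower-casing of a UTF-8 string */
const char *s = "ŸŨĒṪ";
char *ls = tolower(s);
printf("%s\n", ls); /* ÿũēṫ */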

/* strchr() and strrchr(): search for a UTF-8 character passed as a string.
   data is a writable char buffer declared elsewhere. */
strcpy(data, "Be ashamed to die 尊厳と権利と について平等である until you have won some victory for humanity.");
char *ret = strchr(data, "あ");
assert("error, strchr ret != NULL", ret != NULL);
logmsg("strchr() :#%s#\n", ret);
ret = strrchr(data, "あ");
assert("error, strrchr ret != NULL", ret != NULL);
logmsg("strrchr() :#%s#\n", ret);

/* trim(): 'l' trims the left side, 'r' the right side, 0 both sides */
strcpy(data, " 生まれながらにして自由であり ");
trim(data, 'l');
logmsg("left trim:#%s#\n", data);
trim(data, 'r');
logmsg("right trim:#%s#\n", data);
trim(data, 0);
logmsg("trim both sides:#%s#\n", data);
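
A trim like this can work directly on bytes, because in UTF-8 the byte 0x20 never occurs inside a multi-byte sequence. The sketch below is a stand-in written under assumptions, not elibc's trim(): trim_ascii is a hypothetical name, it strips only ASCII spaces, and its side argument merely mirrors the 'l' / 'r' / 0 convention used in the snippet above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strip ASCII spaces from s in place.
 * side == 'l' trims the left, 'r' trims the right, anything else both. */
void trim_ascii(char *s, int side)
{
    size_t len = strlen(s);

    if (side != 'l') {
        /* Right side: move the terminator back over trailing spaces. */
        while (len > 0 && s[len - 1] == ' ')
            len--;
        s[len] = '\0';
    }
    if (side != 'r') {
        /* Left side: skip leading spaces, then shift the rest (including
         * the terminator) to the front of the buffer. */
        size_t start = 0;
        while (s[start] == ' ')
            start++;
        memmove(s, s + start, len - start + 1);
    }
}
```

Other whitespace, such as the ideographic space U+3000, would need a decoder-aware variant and is left out of this sketch.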

/* Multilingual sample text shared by the snippets below */
const char *string = "All human beings are born free and equal in"
"dignity and rights. They are endowed with reason and conscience "
"and should act towards on another in a spirit of brotherhood."
"すべての人間は、生まれながらにして自由であり、かつ、尊厳と権利と について平等である。"
"人間は、理性と良心とを授けられており、互いに同胞の精神をもって行動しなければならない。"
"มนุษย์ทั้งหลายเกิดมามีอิสระและเสมอภาคกันในเกียรติศักด[เกียรติศักดิ์]และสิทธิ "
"ต่างมีเหตุผลและมโนธรรม และควรปฏิบัติต่อกันด้วยเจตนารมณ์แห่งภราดรภาพ"
"모든 인간은 태어날 때부터 자유로우며 그 존엄과 권리에 있어 동등하다. "
"인간은 천부적으로 이성과 양심을 부여받았으며 서로 형제애의 정신으로 행동하여야 한다. ";

/* substr(): extract a substring by character offsets */
const char *needle = "ならない。";
strcpy(data, string);
char *sub = substr(data, 249, 254);
int r = strncmp(needle, sub, strlen(needle));
logmsg("substr:#%s#\n", sub);
assert("error, substr, r == 0", r == 0);

/* strlen(): the asserted length counts characters, not bytes */
int len = strlen(string);
logmsg("strlen:%d\n", len);
assert("error, strlen == 486", len == 486);

/* charAt(): the character at a given character offset */
const char *c = charAt(data, 249);
logmsg("charAt:#%s#\n", c);
const char *s = "な";
int r = strncmp(s, c, strlen(s));
assert("error, charAt r == 0", r == 0);

/* index() and rindex(): character offset of a substring */
const char *needle = "理性と良心";
strcpy(data, string);
int i = index(data, needle, 0);
assert("error, index, i == 215", i == 215);
i = rindex(data, needle, 255);
assert("error, rindex, i == 215", i == 215);
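
The offsets asserted above are character positions, not byte positions. One way such a lookup could be written is sketched below, as an assumption rather than elibc's code: utf8_index is a hypothetical name, it has no start-offset parameter, and both strings are assumed to be valid UTF-8. A byte-wise strstr() is safe here because a valid UTF-8 needle can only match at a character boundary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the character offset of the first occurrence
 * of needle in hay, or -1 if needle does not occur. */
long utf8_index(const char *hay, const char *needle)
{
    const char *hit = strstr(hay, needle);
    long n = 0;

    if (hit == NULL)
        return -1;

    /* Count the code points that start before the match: every byte that
     * is not a 10xxxxxx continuation byte begins one. */
    for (const char *p = hay; p < hit; p++) {
        if (((unsigned char)*p & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the text "あbc", searching for "bc" gives offset 1 with this sketch, whereas the byte offset returned by pointer arithmetic on strstr() would be 3.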

/* search(): regular expression search; pass NULL to continue on the same data */
char pattern[128];
strcpy(data, "Title 尊厳権利");
strcpy(pattern, "(.*)");
char *token = search(data, pattern, 0);
while (token) {
    fprintf(stdout, "search:#%s#\n", token);
    token = search(NULL, pattern, 0);
}
-----------------------------------------------------
fetched:#Title 尊厳権利#
-----------------------------------------------------

/* split(): split on a regular expression delimiter, strtok-style */
strcpy(data, "Jack be nimble, Jack be quick, Jack jump over the candlestick.");
strcpy(delim, "( |^)[A-Za-z]*ick");
char *token = split(data, delim, REGMATCHMAX);
while (token) {
    fprintf(stdout, "split:#%s#\n", token);
    token = split(NULL, delim, REGMATCHMAX);
}

/* strtok(): tokenize on a multi-byte UTF-8 delimiter */
strcpy(data, string);
strcpy(delim, "と");
char *token = strtok(data, delim);
while (token) {
    logmsg("strtok:#%s#\n", token);
    token = strtok(NULL, delim);
}
-----------------------------------------------------
strtok:#All human beings are born free and equal indignity and rights.
They are endowed with reason and conscience and should act towards on another
in a spirit of brotherhood.すべての人間は、生まれながらにして自由であり、かつ、尊厳#
strtok:#権利#
strtok:# について平等である。人間は、理性#
strtok:#良心#
strtok:#を授けられており、互いに同胞の精神をもって行動しなければならない。
มนุษย์ทั้งหลายเกิดมามีอิสระและเสมอภาคกันในเกียรติศักด[เกียรติศักดิ์]และสิทธิ
ต่างมีเหตุผลและมโนธรรม และควรปฏิบัติต่อกันด้วยเจตนารมณ์แห่งภราดรภาพ
모든 인간은 태어날 때부터 자유로우며 그 존엄과 권리에 있어 동등하다.
인간은 천부적으로 이성과 양심을 부여받았으며 서로 형제애의 정신으로 행동하여야 한다. #
-----------------------------------------------------
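
/* endswith(): test whether a string ends with a given suffix */
strcpy(data, "すべての人間は、生まれながらにして自由であり");
bool ret = endswith(data, "自由であり");
assert("error, endswith, ret == true", ret == true);

/* reverse(): reverse a UTF-8 string in place */
strcpy(data, "尊厳と権利と について平等である。");
reverse(data);
logmsg("reverse:#%s#\n", data);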
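
Reversing UTF-8 byte by byte would scramble every multi-byte character. A common two-pass technique avoids that, sketched here under assumptions: utf8_reverse is a hypothetical name, the input is assumed to be valid UTF-8, and this is not necessarily how elibc's reverse() works.

```c
#include <stddef.h>
#include <string.h>

/* Swap the bytes between first and last, both inclusive. */
static void reverse_bytes(char *first, char *last)
{
    while (first < last) {
        char t = *first;
        *first++ = *last;
        *last-- = t;
    }
}

/* Hypothetical helper: reverse the code points of s in place. */
void utf8_reverse(char *s)
{
    size_t len = strlen(s);
    size_t i = 0;

    if (len < 2)
        return;

    /* Pass 1: reverse the bytes inside each code point. */
    while (i < len) {
        size_t j = i + 1;
        while (j < len && ((unsigned char)s[j] & 0xC0) == 0x80)
            j++;
        reverse_bytes(s + i, s + j - 1);
        i = j;
    }

    /* Pass 2: reverse the whole buffer. This puts the code points in
     * reverse order and restores the byte order inside each one. */
    reverse_bytes(s, s + len - 1);
}
```

The sketch reverses code points, so a base letter and a following combining mark end up in swapped order. Keeping such clusters together would need grapheme segmentation, which is outside the scope of this example.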

/* strpbrk(): find the first character that belongs to a set of UTF-8 characters */
const char *str = "Be ashamed to die 尊厳と権利と について平等である "
"until you have won some victory for humanity.";
char charset[128];
strcpy(data, str);
//strcpy(charset, "zpm1"); // Finds the 'm'
strcpy(charset, "平等である"); // Finds the '平'
//strcpy(charset, "自由"); // Returns NULL
char *ret = strpbrk(data, charset);
assert("error, strpbrk, ret != NULL", ret != NULL);
logmsg("strpbrk() :#%s#\n", ret);

/* replace(): case c, complex replace with regular expressions */
strcpy(data, "She sells seashells on the seashore. The seashells she sells are seashore seashells.");
strcpy(rep, "( |^)[A-Za-z]*[Ss]he[a-z]*[ .,$]");
strcpy(with, "&");
char *s = replace(data, rep, with);
logmsg("replace:#%s#\n", s);

Notice

This library is now deprecated because a brand-new text encoding system has been created to replace Unicode and UTF-8, a big and clumsy system that has been used to fool people.

References

  1. PEG (Parsing Expression Grammars) and Regular Expression on Wikipedia.
  2. eCos, the free open source real-time operating system.
  3. eCosCentric, founded by former Red Hat eCos developers to provide commercial eCos support.