From 8ac9a2f752fe2c66b611b286d9215523c48ad05c Mon Sep 17 00:00:00 2001
From: Leo Tenenbaum
Date: Sat, 7 Dec 2019 21:59:34 -0500
Subject: more docs

---
 README.html | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 README.md   | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 docs.sh     |  2 +-
 main.c      |  2 ++
 parse.c     |  2 +-
 5 files changed, 125 insertions(+), 5 deletions(-)

diff --git a/README.html b/README.html
index fe1fd75..e34ab71 100644
--- a/README.html
+++ b/README.html
@@ -28,4 +28,67 @@ it is nearly as fast in theory.

tests has some test programs written in toc.

-

To compile the compiler on a Unix-y system, use

+

To compile the compiler on a Unix-y system, just run build.sh. You can supply a compiler by running CC=tcc build.sh, or build it in release mode with ./build.sh release (which will help speed up compiling large programs).

+ +

On other systems, you can just compile main.c with a C compiler. toc uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including clang, gcc, and tcc. It can also be compiled as if it were C++, but it does break the standard in a few places*. So, MSVC can also compile it. The outputted code should be C99-compliant.

+ +
+ +

toc Source Code

+ +

toc is written in C, for speed and portability. It has no dependencies, other than the C runtime library.

+ +

Build system

+ +

toc is set up as a unity build, meaning that there is only one translation unit. So, main.c #includes toc.c, which #includes all of toc's files. This improves (from scratch) compilation speeds, since you don't have to include headers a bunch of times for each translation unit. This is more of a problem in C++, where, for example, doing #include <map> ends up turning into 25,000 lines after preprocessing. All of toc's source code, which includes most of the C standard library, at the time of this writing (Dec 2019) is only 22,000 lines after preprocessing; imagine including all of that once for each translation unit which includes map. It also obviates the need for fancy build systems like CMake.

+ +

New features

+ +

Here are all the C99 features which toc depends on (I might have forgotten some...):

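+- Declare anywhere
+- stdint.h
+- Non-constant struct literal initializers (e.g. int x[2] = {y, z};)
+- Variadic macros and __VA_ARGS__
+- Flexible array members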

The last three of those could all be removed fairly easily.
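For a concrete picture of those C99 features, here is a minimal sketch that uses each of them once. It is not taken from toc's source: `LOG`, `Str`, `str_new` and `sum_pair` are made-up names for illustration only.

```c
#include <assert.h>
#include <stdint.h>   /* C99: fixed-width integer types */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* C99: variadic macro using __VA_ARGS__ (LOG is a hypothetical name) */
#define LOG(fmt, ...) fprintf(stderr, "log: " fmt "\n", __VA_ARGS__)

/* C99: flexible array member as the last field of a struct */
typedef struct {
    size_t len;
    char data[];
} Str;

/* Allocate a Str holding a copy of s; returns NULL if malloc fails. */
static Str *str_new(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s);                   /* C99: declaration after a statement */
    Str *str = malloc(sizeof *str + len + 1); /* header plus room for the bytes */
    if (!str) return NULL;
    str->len = len;
    memcpy(str->data, s, len + 1);
    return str;
}

/* Add two values via an array whose initializer is not a constant expression. */
static int64_t sum_pair(int32_t y, int32_t z) {
    LOG("summing %d and %d", (int)y, (int)z);
    int32_t x[2] = {y, z};                    /* C99: non-constant initializer */
    return (int64_t)x[0] + x[1];
}
```

Calling `sum_pair(2, 3)` returns 5, and `str_new("toc")` gives back a single allocation that the caller releases with `free`.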

+ +

And here are all of its C11 features:

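+- Anonymous structures/unions
+- max_align_t and alignof - It can still compile without these but it won't technically be standard-compliant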
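The C11 features in question are anonymous structures/unions, plus `max_align_t` and `alignof`. As a rough illustration (again not toc's actual code; `Value`, `value_from_int` and `align_up` are hypothetical names), they look like this:

```c
#include <assert.h>    /* C11: static_assert */
#include <stdalign.h>  /* C11: alignof */
#include <stddef.h>    /* C11: max_align_t */

/* max_align_t is at least as strictly aligned as any scalar type. */
static_assert(alignof(max_align_t) >= alignof(long), "unexpected alignment");

typedef struct {
    int kind;
    union {            /* C11: anonymous union, members are accessed directly */
        long i;
        double f;
    };
} Value;

/* Build a Value holding an integer (kind 0 is an arbitrary tag for this sketch). */
static Value value_from_int(long i) {
    Value v;
    v.kind = 0;
    v.i = i;           /* no intermediate member name needed */
    return v;
}

/* Round n up to a multiple of the strictest fundamental alignment. */
static size_t align_up(size_t n) {
    size_t a = alignof(max_align_t);
    return (n + a - 1) / a * a;
}
```

Something like `align_up` is the usual reason an allocator wants `max_align_t`: without it, the code has to guess the strictest alignment, which works in practice but is not guaranteed by the standard.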

More

+ +

See main.c for a bit more information.

+ +
+ +

* for those curious, it has to do with goto. In C, this program:

+ +

+int main() {  
+    goto label;  
+    int x = 5;  
+    label:  
+    return 0;  
+}
+
+ +

Is completely fine. x will hold an indeterminate value after the jump (but it isn't used, so it doesn't really matter). In C++, however, this is an ill-formed program, because the goto jumps past the initialization of x. This is a bit ridiculous since

+ +

+int main() {  
+    goto label;  
+    int x; x = 5;  
+    label:  
+    return 0;  
+}
+
+ +

is fine. So that's an interesting little "fun fact": int x = 5; isn't always the same as int x; x = 5; in C++.
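If you want a goto-based error path that both a C and a C++ compiler will accept, one option is to split the declaration from the assignment, as in the second program. Here is a small sketch of that pattern; `first_digit` and `first_digit_selfcheck` are hypothetical functions, not part of toc.

```c
#include <assert.h>
#include <stddef.h>

/* Return the leading decimal digit of s, or -1 if there is none. */
static int first_digit(const char *s) {
    if (s == NULL) goto fail;
    int value;             /* no initializer, so jumping past it is allowed in C++ too */
    value = s[0] - '0';
    if (value < 0 || value > 9) goto fail;
    return value;
fail:
    return -1;
}

/* Quick sanity check of the three paths through first_digit. */
static void first_digit_selfcheck(void) {
    assert(first_digit("7") == 7);
    assert(first_digit("x") == -1);
    assert(first_digit(NULL) == -1);
}
```

Writing `int value = s[0] - '0';` instead would still be valid C, but a C++ compiler would reject the first `goto fail;` for crossing an initialization.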

diff --git a/README.md b/README.md
index 4cbe8a7..9720134 100644
--- a/README.md
+++ b/README.md
@@ -23,9 +23,64 @@ x : int; x = 5; // Declare x as an integer, then set it to 5.
 `toc` is statically typed and has many of C's features, but it is nearly as fast in theory.
-
+
 See `docs` for more information (in progress).
 `tests` has some test programs written in `toc`.
-To compile the compiler on a Unix-y system, use
+To compile the compiler on a Unix-y system, just run `build.sh`. You can supply a compiler by running `CC=tcc build.sh`, or build it in release mode with `./build.sh release` (which will help speed up compiling large programs).
+
+On other systems, you can just compile main.c with a C compiler. toc uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including `clang`, `gcc`, and `tcc`. It can also be compiled as if it were C++, but it does break the standard in a few places\*. So, MSVC can also compile it. The *outputted* code should be C99-compliant.
+
+---
+
+### `toc` Source Code
+
+`toc` is written in C, for speed and portability. It has no dependencies, other than the C runtime library.
+
+#### Build system
+`toc` is set up as a unity build, meaning that there is only one translation unit. So, `main.c` `#include`s `toc.c`, which `#include`s all of `toc`'s files. This improves (from scratch) compilation speeds, since you don't have to include headers a bunch of times for each translation unit. This is more of a problem in C++, where, for example, doing `#include <map>` ends up turning into 25,000 lines after preprocessing. All of toc's source code, which includes most of the C standard library, at the time of this writing (Dec 2019) is only 22,000 lines after preprocessing; imagine including all of that once for each translation unit which includes `map`. It also obviates the need for fancy build systems like CMake.
+
+#### New features
+
+Here are all the C99 features which `toc` depends on (I might have forgotten some...):
+
+- Declare anywhere
+- `stdint.h`
+- Non-constant struct literal initializers (e.g. `int x[2] = {y, z};`)
+- Variadic macros and `__VA_ARGS__`
+- Flexible array members
+
+The last three of those could all be removed fairly easily.
+
+And here are all of its C11 features:
+
+- Anonymous structures/unions
+- `max_align_t` and `alignof` - It can still compile without these but it won't technically be standard-compliant
+
+#### More
+
+See `main.c` for a bit more information.
+
+---
+
+
+\* for those curious, it has to do with `goto`. In C, this program:
+

+int main() {  
+	goto label;  
+	int x = 5;  
+	label:  
+	return 0;  
+}
+
+Is completely fine. `x` will hold an indeterminate value after the jump (but it isn't used, so it doesn't really matter). In C++, however, this is an ill-formed program, because the `goto` jumps past the initialization of `x`. This is a bit ridiculous since
+

+int main() {  
+	goto label;  
+	int x; x = 5;  
+	label:  
+	return 0;  
+}
+
+is fine. So that's an interesting little "fun fact": `int x = 5;` isn't always the same as `int x; x = 5;` in C++.
diff --git a/docs.sh b/docs.sh
index 81cde33..87e4112 100755
--- a/docs.sh
+++ b/docs.sh
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/bin/sh
 markdown README.md > README.html
 for x in docs/*.md; do
 	echo $x
diff --git a/main.c b/main.c
index da2303e..592dc7d 100644
--- a/main.c
+++ b/main.c
@@ -31,6 +31,8 @@ allow omission of trailing ; in foo ::= fn() {}?
 #ifdef __cplusplus
 #define new new_
 #define this this_
+#elif __STDC_VERSION__ < 199901
+#define inline
 #endif
 #include "toc.c"
diff --git a/parse.c b/parse.c
index d80afd1..efd78d4 100644
--- a/parse.c
+++ b/parse.c
@@ -1931,7 +1931,7 @@ static bool parse_file(Parser *p, ParsedFile *f) {
 	return ret;
 }
-#define PARSE_PRINT_LOCATION(l) //fprintf(out, "[l%lu]", (unsigned long)(l).line);
+#define PARSE_PRINT_LOCATION(l) /* fprintf(out, "[l%lu]", (unsigned long)(l).line); */
 /* in theory, this shouldn't be global, but these functions are mostly for debugging anyways */
 static bool parse_printing_after_types;
-- 
cgit v1.2.3