<h2>toc</h2>
<p><code>toc</code> is a language which compiles to C.</p>
<hr />
<h3>About</h3>
<p><code>toc</code> is currently in development. <strong>It is not a stable language,
and there are almost definitely bugs right now.</strong>
I would recommend against using it for anything big or important.
Many parts of it may change in the future.</p>
<p><code>toc</code> improves on C's syntax (and semantics) in many ways.
For example, to declare <code>x</code> as an integer and set it to 5,
you can do any of the following:</p>
<p><code>
x := 5; // Declare x and set x to 5 (infer type) <br />
x : int = 5; // Explicitly make the type int. <br />
x : int; x = 5; // Declare x as an integer, then set it to 5.
</code></p>
<p><code>toc</code> is statically typed and has many of C's features, and
in theory it is nearly as fast as C.</p>
<p>See <code>docs</code> for more information (in progress).</p>
<p><code>tests</code> has some test programs written in <code>toc</code>.</p>
<p>To compile the compiler on a Unix-y system, just run <code>./build.sh release</code>. You can supply a compiler by running <code>CC=tcc ./build.sh release</code>, or build it in debug mode by omitting <code>release</code>.</p>
<p>On other systems, you can just compile <code>main.c</code> with a C compiler. <code>toc</code> uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including <code>clang</code>, <code>gcc</code>, and <code>tcc</code>. It can also be compiled as if it were C++, so MSVC can compile it too, although it does break the C++ standard in a few places*. The <em>outputted</em> code should be C99-compliant.</p>
<h4>Why it compiles to C</h4>
<p><code>toc</code> compiles to C for three reasons:</p>
<ul>
<li>Speed. C is one of the most performant programming languages out there. It also has compilers which are very good at optimizing (better than anything I could write). </li>
<li>Portability. C is probably the most portable language. It has existed for more than 30 years and can run on practically anything. Furthermore, all major languages nowadays can call functions written in C.</li>
<li>Laziness. I don't really want to deal with writing something which outputs machine code, and it would certainly be more buggy than something which outputs C.</li>
</ul>
<hr />
<h3><code>toc</code> Source Code</h3>
<p><code>toc</code> is written in C, for speed and portability. It has no dependencies, other than the C runtime library.</p>
<h4>Build system</h4>
<p><code>toc</code> is set up as a unity build, meaning that there is only one translation unit. So, <code>main.c</code> <code>#include</code>s <code>toc.c</code>, which <code>#include</code>s all of <code>toc</code>'s files.</p>
<h5>Why?</h5>
<p>This improves compilation speeds (especially from scratch), since you don't have to include headers a bunch of times for each translation unit. This is more of a problem in C++, where, for example, doing <code>#include &lt;map&gt;</code> ends up turning into 25,000 lines after preprocessing. All of <code>toc</code>'s source code, which includes most of the C standard library, is only 22,000 lines after preprocessing at the time of this writing (Dec 2019); imagine including all of that once for each translation unit which includes <code>map</code>. It also obviates the need for fancy build systems like CMake.</p>
<h4>New features</h4>
<p>Here are all the C99 features which <code>toc</code> depends on (I might have forgotten some...):</p>
<ul>
<li>Declare anywhere</li>
<li><code>inttypes.h</code></li>
<li>Non-constant initializers for arrays and structs (e.g. <code>int x[2] = {y, z};</code>)</li>
<li>Flexible array members</li>
</ul>
<p>The last three of those could all be removed fairly easily (assuming the system actually has 8-, 16-, 32-, and 64-bit signed and unsigned types).</p>
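<p>As a rough illustration of the four C99 features listed above, here is a minimal sketch. It is not taken from <code>toc</code>'s source: <code>Buffer</code>, <code>buffer_new</code>, <code>sum_pair</code>, and <code>format_i64</code> are hypothetical names made up for this example.</p>

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration only; not part of toc. */
typedef struct {
    size_t len;
    char data[]; /* flexible array member: storage follows the struct */
} Buffer;

/* Copies s into a freshly allocated Buffer; returns NULL if malloc fails. */
static Buffer *buffer_new(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s); /* "declare anywhere": declaration after a statement */
    Buffer *b = malloc(sizeof *b + len + 1);
    if (b == NULL) return NULL;
    b->len = len;
    memcpy(b->data, s, len + 1); /* copy the terminating '\0' too */
    return b;
}

/* Adds y and z via an array whose initializer elements are not constants. */
static int64_t sum_pair(int64_t y, int64_t z) {
    int64_t x[2] = {y, z};
    return x[0] + x[1];
}

/* Formats v into out using the PRId64 macro from inttypes.h. */
static int format_i64(char *out, size_t n, int64_t v) {
    return snprintf(out, n, "%" PRId64, v);
}
```

<p>A C89-only compiler would be expected to reject each of these functions, which is why a compiler with C99 support is needed.</p>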
<p>And here are all of its C11 features:</p>
<ul>
<li>Anonymous structures/unions</li>
<li><code>max_align_t</code> and <code>alignof</code> - It can still compile without these but it won't technically be standard-compliant</li>
</ul>
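<p>For readers unfamiliar with these C11 features, the following is a small sketch of what they look like. The names <code>Value</code>, <code>value_int</code>, and <code>align_up</code> are hypothetical and do not come from <code>toc</code>'s source.</p>

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical tagged value. The union has no name (C11 anonymous union),
   so its members are accessed directly as v.i and v.f. */
typedef struct {
    enum { VAL_INT, VAL_FLOAT } kind;
    union {
        long i;
        double f;
    };
} Value;

/* Builds a Value holding an integer. */
static Value value_int(long i) {
    Value v;
    v.kind = VAL_INT;
    v.i = i;
    return v;
}

/* Rounds n up to a multiple of alignof(max_align_t), the strictest
   fundamental alignment on the target. */
static size_t align_up(size_t n) {
    size_t a = alignof(max_align_t);
    return (n + a - 1) / a * a;
}
```

<p>Something like <code>align_up</code> is the kind of helper a custom allocator might use so that any object type can be stored at the returned offset; whether <code>toc</code> uses these features that way is an assumption here.</p>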
<h4>More</h4>
<p>See <code>main.c</code> for a bit more information.</p>
<hr />
<h3>Version history</h3>
<p>Here are the major versions of <code>toc</code>.</p>
<table>
<tr><th>Version</th><th>Description</th><th>Date</th></tr>
<tr><td>0.0</td><td>Initial version.</td><td>2019 Dec 6</td></tr>
<tr><td>0.1</td><td>Constant parameter inference.</td><td>2019 Dec 15</td></tr>
<tr><td>0.1.1</td><td>Better constant parameter inference.</td><td>2019 Dec 16</td></tr>
</table>
<hr />
<h3>Report a bug</h3>
<p>If you find a bug, you can report it through <a href="https://github.com/pommicket/toc/issues">GitHub's issue tracker</a>, or by emailing pommicket@gmail.com.</p>
<p>Just send me the <code>toc</code> source code which results in the bug, and I'll try to fix it. </p>
<hr />
<p>* for those curious, it has to do with <code>goto</code>. In C, this program:</p>
<pre><code>
int main() {
    goto label;
    int x = 5;
label:
    return 0;
}
</code></pre>
<p>is completely fine. <code>x</code> will hold an indeterminate value after the jump (but it isn't used so it doesn't really matter). Apparently, in C++, this is an ill-formed program. This is a bit ridiculous since</p>
<pre><code>
int main() {
    goto label;
    int x; x = 5;
label:
    return 0;
}
</code></pre>
<p>is fine. So that's an interesting little "fun fact": <code>int x = 5;</code> isn't always the same as <code>int x; x = 5;</code> in C++.</p>