<h2>toc</h2>
<p><code>toc</code> is a language which compiles to C.</p>
<hr />
<h3>About</h3>
<p><code>toc</code> is currently in development. <strong>It is not a stable language,
and there are almost definitely bugs right now.</strong>
I would recommend against using it for anything big or important.
Many parts of it may change in the future.</p>
<p><code>toc</code> improves on C’s syntax (and semantics) in many ways.
For example, to declare <code>x</code> as an integer and set it to 5,
you can do any of the following:</p>
<pre><code>
x := 5; // Declare x and set x to 5 (infer type)
x : int = 5; // Explicitly make the type int.
x : int; x = 5; // Declare x as an integer, then set it to 5.
</code></pre>
<p><code>toc</code> is statically typed and has many of C’s features, and
in theory it is nearly as fast as C.</p>
<p>See <code>docs</code> for more information (in progress).</p>
<p><code>tests</code> has some test programs written in <code>toc</code>.</p>
<p>To compile the compiler on a Unix-y system, just run <code>./build.sh release</code>. You can supply a compiler by running <code>CC=tcc ./build.sh release</code>, or build in debug mode by leaving out the <code>release</code> argument.</p>
<p>On other systems, you can just compile <code>main.c</code> with a C compiler. <code>toc</code> uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including <code>clang</code>, <code>gcc</code>, and <code>tcc</code>. It can also be compiled as if it were C++, although it breaks the C++ standard in a few places*, so MSVC can compile it too. The <em>outputted</em> code should be C99-compliant.</p>
<h4>Why it compiles to C</h4>
<p><code>toc</code> compiles to C for three reasons:</p>
<ul>
<li>Speed. C is one of the most performant programming languages out there. It also has compilers which are very good at optimizing (better than anything I could write).</li>
<li>Portability. C is probably the most portable language. It has existed for more than 30 years and can run on practically anything. Furthermore, all major languages nowadays can call functions written in C.</li>
<li>Laziness. I don’t really want to deal with writing something which outputs machine code, and it would certainly be more buggy than something which outputs C.</li>
</ul>
<hr />
<h3><code>toc</code> Source Code</h3>
<p><code>toc</code> is written in C, for speed and portability. It has no dependencies, other than the C runtime library.</p>
<h4>Build system</h4>
<p><code>toc</code> is set up as a unity build, meaning that there is only one translation unit. So, <code>main.c</code> <code>#include</code>s <code>toc.c</code>, which <code>#include</code>s all of <code>toc</code>’s files.</p>
<h5>Why?</h5>
<p>This improves compilation speeds (especially from scratch), since you don’t have to include the same headers again for each translation unit. This is more of a problem in C++, where, for example, doing <code>#include &lt;map&gt;</code> ends up turning into 25,000 lines after preprocessing. All of <code>toc</code>’s source code, which includes most of the C standard library, at the time of this writing (Dec 2019) is only 22,000 lines after preprocessing; imagine including all of that once for each translation unit which includes <code>map</code>. It also obviates the need for fancy build systems like CMake.</p>
<h4>New features</h4>
<p>Here are all the C99 features which <code>toc</code> depends on (I might have forgotten some…):</p>
<ul>
<li>Declare anywhere</li>
<li><code>inttypes.h</code></li>
<li>Non-constant initializers for structs and arrays (e.g. <code>int x[2] = {y, z};</code>)</li>
<li>Flexible array members</li>
</ul>
<p>The last three of those could all be removed fairly easily (assuming the system actually has 8-, 16-, 32-, and 64-bit signed and unsigned types).</p>
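<p>For readers unfamiliar with these features, here is a minimal sketch that uses all four in one place. It is not taken from <code>toc</code>’s source: the type <code>I64Array</code> and the functions <code>pair_new</code> and <code>pair_sum_demo</code> are hypothetical names made up for illustration.</p>

```c
#include <assert.h>
#include <inttypes.h>   /* C99: fixed-width integer types and PRI* format macros */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical type (not from toc's source).
   C99 flexible array member: `data` has no declared size and must come last. */
typedef struct {
    size_t len;
    int64_t data[];
} I64Array;

/* Allocates an array holding y and z; returns NULL if malloc fails. */
I64Array *pair_new(int64_t y, int64_t z) {
    I64Array *arr = malloc(sizeof *arr + 2 * sizeof arr->data[0]);
    if (!arr) return NULL;
    arr->len = 2;
    /* C99 "declare anywhere": this declaration comes after statements.
       C99 also allows the non-constant values y and z in the initializer. */
    int64_t tmp[2] = {y, z};
    arr->data[0] = tmp[0];
    arr->data[1] = tmp[1];
    return arr;
}

/* Prints each element with an inttypes.h format macro and returns the sum. */
int64_t pair_sum_demo(int64_t y, int64_t z) {
    I64Array *arr = pair_new(y, z);
    assert(arr != NULL);
    int64_t sum = 0;
    for (size_t i = 0; i < arr->len; ++i) { /* declaring i here is also C99 */
        printf("data[%zu] = %" PRId64 "\n", i, arr->data[i]);
        sum += arr->data[i];
    }
    free(arr);
    return sum;
}
```

<p>Replacing these would mostly mean hoisting declarations to the top of each block, assigning array elements one by one, and declaring the trailing array with a fixed size, which is roughly why they are easy to remove.</p>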
<p>And here are all of its C11 features:</p>
<ul>
<li>Anonymous structures/unions</li>
<li><code>max_align_t</code> and <code>alignof</code> - It can still compile without these but it won’t technically be standard-compliant</li>
</ul>
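<p>As a rough illustration of what these two C11 features look like in practice, here is a small sketch. It is not from <code>toc</code>’s source: <code>Value</code>, <code>value_from_int</code>, and <code>round_up_to_max_align</code> are hypothetical names, and the <code>static_assert</code> line is an extra C11 feature used here only as a sanity check.</p>

```c
#include <assert.h>    /* C11: static_assert */
#include <stdalign.h>  /* C11: alignof */
#include <stddef.h>    /* C11: max_align_t */

/* Hypothetical tagged value (not from toc's source). */
typedef struct {
    int is_float;
    union {            /* C11 anonymous union: its members are accessed directly */
        long long i;
        double f;
    };
} Value;

/* max_align_t is at least as strictly aligned as every scalar type. */
static_assert(alignof(max_align_t) >= alignof(double),
              "max_align_t should cover double");

/* Builds an integer Value without naming the union member's container. */
Value value_from_int(long long i) {
    Value v;
    v.is_float = 0;
    v.i = i;           /* v.i rather than something like v.as.i */
    return v;
}

/* Rounds n up to a multiple of the strictest fundamental alignment,
   as a simple allocator might do before handing out memory. */
size_t round_up_to_max_align(size_t n) {
    size_t a = alignof(max_align_t);
    return (n + a - 1) / a * a;
}
```

<p>Without <code>max_align_t</code> and <code>alignof</code>, a program has to guess a suitable alignment, which usually works but is not guaranteed by the standard.</p>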
<h4>More</h4>
<p>See <code>main.c</code> for a bit more information.</p>
<hr />
<h3>Version history</h3>
<p>Here are the major versions of <code>toc</code>.</p>
<table>
<thead>
<tr>
<th> Version </th>
<th> Description </th>
<th> Date </th>
</tr>
</thead>
<tbody>
<tr>
<td> 0.0 </td>
<td> Initial version. </td>
<td> 2019 Dec 6 </td>
</tr>
<tr>
<td> 0.1 </td>
<td> Constant parameter inference. </td>
<td> 2019 Dec 15 </td>
</tr>
</tbody>
</table>
<hr />
<h3>Report a bug</h3>
<p>If you find a bug, you can report it through <a href="https://github.com/pommicket/toc/issues">GitHub’s issue tracker</a>, or by emailing pommicket@gmail.com.</p>
<p>Just send me the <code>toc</code> source code which results in the bug, and I’ll try to fix it.</p>
<hr />
<p>* for those curious, it has to do with <code>goto</code>. In C, this program:</p>
<pre><code>
int main() {
    goto label;
    int x = 5;
label:
    return 0;
}
</code></pre>
<p>is completely fine. <code>x</code> will hold an indeterminate value after the jump (but it isn’t used so it doesn’t really matter). Apparently, in C++, this is an ill-formed program, because the jump skips over a declaration with an initializer. This is a bit ridiculous since</p>
<pre><code>
int main() {
    goto label;
    int x; x = 5;
label:
    return 0;
}
</code></pre>
<p>is fine. So that’s an interesting little “fun fact”: <code>int x = 5;</code> isn’t always the same as <code>int x; x = 5;</code> in C++.</p>