<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />
<title>Intel® ISPC User's Guide</title>
<link rel="stylesheet" href="css/style.css" type="text/css" />
</head>
<body>
<div class="document" id="intel-ispc-user-s-guide">
<div id="wrap">
<div id="wrap2">
<div id="header">
<h1 id="logo">Intel® Implicit SPMD Program Compiler</h1>
<div id="slogan">An open-source compiler for high-performance SIMD programming on
the CPU and GPU</div>
</div>
<div id="nav">
<div id="nbar">
<ul>
<li><a href="index.html">Overview</a></li>
<li><a href="features.html">Features</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li id="selected"><a href="documentation.html">Documentation</a></li>
<li><a href="perf.html">Performance</a></li>
<li><a href="contrib.html">Contributors</a></li>
</ul>
</div>
</div>
<div id="content-wrap">
<div id="sidebar">
<div class="widgetspace">
<h1>Resources</h1>
<ul class="menu">
<li><a href="http://github.com/ispc/ispc">GitHub page</a></li>
<li><a href="https://github.com/ispc/ispc/discussions">Discussions on GitHub</a></li>
<li><a href="http://github.com/ispc/ispc/issues">Issues on GitHub</a></li>
<li><a href="https://github.com/orgs/ispc/projects/1">Release planning board</a></li>
<li><a href="https://github.com/ispc/ispc/blob/main/CONTRIBUTING.md">Contributing guide</a></li>
<li><a href="http://github.com/ispc/ispc/wiki">Wiki on GitHub</a></li>
</ul>
</div>
</div>
<h1 class="title">Intel® ISPC User's Guide</h1>
<div id="content">
<p>The Intel® Implicit SPMD Program Compiler (Intel® ISPC) is a compiler for
writing SPMD (single program multiple data) programs to run on the CPU and GPU.
The SPMD
programming approach is widely known to graphics and GPGPU programmers; it
is used for GPU shaders and CUDA* and OpenCL* kernels, for example. The
main idea behind SPMD is that one writes programs as if they were operating
on a single data element (a pixel for a pixel shader, for example), but
then the underlying hardware and runtime system executes multiple
invocations of the program in parallel with different inputs (the values
for different pixels, for example).</p>
<p>The main goals behind <tt class="docutils literal">ispc</tt> are to:</p>
<ul class="simple">
<li>Build a variant of the C programming language that delivers good
performance to performance-oriented programmers who want to run SPMD
programs on CPUs and GPUs.</li>
<li>Provide a thin abstraction layer between the programmer and the
hardware--in particular, to follow the lesson from C for serial programs
of having an execution and data model where the programmer can cleanly
reason about the mapping of their source program to compiled assembly
language and the underlying hardware.</li>
<li>Harness the computational power of Single Instruction, Multiple Data (SIMD) vector
units without the low-productivity chore of writing intrinsics
directly.</li>
<li>Explore opportunities from close-coupling between C/C++ application code
and SPMD <tt class="docutils literal">ispc</tt> code running on the same processor--lightweight function
calls between the two languages, sharing data directly via pointers without
copying or reformatting, etc.</li>
</ul>
<p><strong>We are very interested in your feedback and comments about ispc and
in hearing your experiences using the system. We are especially interested
in hearing if you try using ispc but see results that are not as you
were expecting or hoping for.</strong> We encourage you to send a note with your
experiences or comments to the <a class="reference external" href="https://github.com/ispc/ispc/discussions">GitHub Discussions</a> forum or to file bug or
feature requests with the <tt class="docutils literal">ispc</tt> <a class="reference external" href="https://github.com/ispc/ispc/issues?state=open">bug tracker</a>. (Thanks!)</p>
<p>Contents:</p>
<ul class="simple">
<li><a class="reference internal" href="#recent-changes-to-ispc">Recent Changes to ISPC</a><ul>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-1">Updating ISPC Programs For Changes In ISPC 1.1</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-2">Updating ISPC Programs For Changes In ISPC 1.2</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-3">Updating ISPC Programs For Changes In ISPC 1.3</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-5-0">Updating ISPC Programs For Changes In ISPC 1.5.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-6-0">Updating ISPC Programs For Changes In ISPC 1.6.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-7-0">Updating ISPC Programs For Changes In ISPC 1.7.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-8-2">Updating ISPC Programs For Changes In ISPC 1.8.2</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-9-0">Updating ISPC Programs For Changes In ISPC 1.9.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-9-1">Updating ISPC Programs For Changes In ISPC 1.9.1</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-9-2">Updating ISPC Programs For Changes In ISPC 1.9.2</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-10-0">Updating ISPC Programs For Changes In ISPC 1.10.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-11-0">Updating ISPC Programs For Changes In ISPC 1.11.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-12-0">Updating ISPC Programs For Changes In ISPC 1.12.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-13-0">Updating ISPC Programs For Changes In ISPC 1.13.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-14-0">Updating ISPC Programs For Changes In ISPC 1.14.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-14-1">Updating ISPC Programs For Changes In ISPC 1.14.1</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-15-0">Updating ISPC Programs For Changes In ISPC 1.15.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-16-0">Updating ISPC Programs For Changes In ISPC 1.16.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-17-0">Updating ISPC Programs For Changes In ISPC 1.17.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-18-0">Updating ISPC Programs For Changes In ISPC 1.18.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-19-0">Updating ISPC Programs For Changes In ISPC 1.19.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-20-0">Updating ISPC Programs For Changes In ISPC 1.20.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-21-0">Updating ISPC Programs For Changes In ISPC 1.21.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-22-0">Updating ISPC Programs For Changes In ISPC 1.22.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-23-0">Updating ISPC Programs For Changes In ISPC 1.23.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-24-0">Updating ISPC Programs For Changes In ISPC 1.24.0</a></li>
<li><a class="reference internal" href="#updating-ispc-programs-for-changes-in-ispc-1-25-0">Updating ISPC Programs For Changes In ISPC 1.25.0</a></li>
</ul>
</li>
<li><a class="reference internal" href="#getting-started-with-ispc">Getting Started with ISPC</a><ul>
<li><a class="reference internal" href="#installing-ispc">Installing ISPC</a></li>
<li><a class="reference internal" href="#compiling-and-running-a-simple-ispc-program">Compiling and Running a Simple ISPC Program</a></li>
</ul>
</li>
<li><a class="reference internal" href="#using-the-ispc-compiler">Using The ISPC Compiler</a><ul>
<li><a class="reference internal" href="#basic-command-line-options">Basic Command-line Options</a></li>
<li><a class="reference internal" href="#selecting-the-compilation-target">Selecting The Compilation Target</a></li>
<li><a class="reference internal" href="#selecting-32-or-64-bit-addressing">Selecting 32 or 64 Bit Addressing</a></li>
<li><a class="reference internal" href="#the-preprocessor">The Preprocessor</a></li>
<li><a class="reference internal" href="#debugging">Debugging</a></li>
<li><a class="reference internal" href="#optimization-settings">Optimization Settings</a></li>
<li><a class="reference internal" href="#other-ways-of-passing-arguments-to-ispc">Other ways of passing arguments to ISPC</a></li>
</ul>
</li>
<li><a class="reference internal" href="#the-ispc-parallel-execution-model">The ISPC Parallel Execution Model</a><ul>
<li><a class="reference internal" href="#basic-concepts-program-instances-and-gangs-of-program-instances">Basic Concepts: Program Instances and Gangs of Program Instances</a></li>
<li><a class="reference internal" href="#control-flow-within-a-gang">Control Flow Within A Gang</a><ul>
<li><a class="reference internal" href="#control-flow-example-if-statements">Control Flow Example: If Statements</a></li>
<li><a class="reference internal" href="#control-flow-example-loops">Control Flow Example: Loops</a></li>
<li><a class="reference internal" href="#gang-convergence-guarantees">Gang Convergence Guarantees</a></li>
</ul>
</li>
<li><a class="reference internal" href="#uniform-data">Uniform Data</a><ul>
<li><a class="reference internal" href="#uniform-control-flow">Uniform Control Flow</a></li>
<li><a class="reference internal" href="#uniform-variables-and-varying-control-flow">Uniform Variables and Varying Control Flow</a></li>
</ul>
</li>
<li><a class="reference internal" href="#data-races-within-a-gang">Data Races Within a Gang</a></li>
<li><a class="reference internal" href="#tasking-model">Tasking Model</a></li>
</ul>
</li>
<li><a class="reference internal" href="#the-ispc-language">The ISPC Language</a><ul>
<li><a class="reference internal" href="#relationship-to-the-c-programming-language">Relationship To The C Programming Language</a></li>
<li><a class="reference internal" href="#lexical-structure">Lexical Structure</a><ul>
<li><a class="reference internal" href="#integer-literals">Integer Literals</a></li>
<li><a class="reference internal" href="#floating-point-literals">Floating Point Literals</a></li>
<li><a class="reference internal" href="#string-literals">String Literals</a></li>
</ul>
</li>
<li><a class="reference internal" href="#types">Types</a><ul>
<li><a class="reference internal" href="#basic-types-and-type-qualifiers">Basic Types and Type Qualifiers</a></li>
<li><a class="reference internal" href="#signed-and-unsigned-integer-types">Signed and Unsigned Integer Types</a></li>
<li><a class="reference internal" href="#uniform-and-varying-qualifiers">"uniform" and "varying" Qualifiers</a></li>
<li><a class="reference internal" href="#defining-new-names-for-types">Defining New Names For Types</a></li>
<li><a class="reference internal" href="#pointer-types">Pointer Types</a></li>
<li><a class="reference internal" href="#function-pointer-types">Function Pointer Types</a></li>
<li><a class="reference internal" href="#reference-types">Reference Types</a></li>
<li><a class="reference internal" href="#enumeration-types">Enumeration Types</a></li>
<li><a class="reference internal" href="#short-vector-types">Short Vector Types</a></li>
<li><a class="reference internal" href="#array-types">Array Types</a></li>
<li><a class="reference internal" href="#struct-types">Struct Types</a><ul>
<li><a class="reference internal" href="#operators-overloading">Operators Overloading</a></li>
</ul>
</li>
<li><a class="reference internal" href="#structure-of-array-types">Structure of Array Types</a></li>
</ul>
</li>
<li><a class="reference internal" href="#declarations-and-initializers">Declarations and Initializers</a></li>
<li><a class="reference internal" href="#attributes">Attributes</a><ul>
<li><a class="reference internal" href="#noescape">noescape</a></li>
<li><a class="reference internal" href="#address-space">address_space</a></li>
<li><a class="reference internal" href="#unmangled">unmangled</a></li>
</ul>
</li>
<li><a class="reference internal" href="#expressions">Expressions</a><ul>
<li><a class="reference internal" href="#dynamic-memory-allocation">Dynamic Memory Allocation</a></li>
<li><a class="reference internal" href="#type-casting">Type Casting</a></li>
</ul>
</li>
<li><a class="reference internal" href="#control-flow">Control Flow</a><ul>
<li><a class="reference internal" href="#conditional-statements-if">Conditional Statements: "if"</a></li>
<li><a class="reference internal" href="#conditional-statements-switch">Conditional Statements: "switch"</a></li>
<li><a class="reference internal" href="#iteration-statements">Iteration Statements</a><ul>
<li><a class="reference internal" href="#basic-iteration-statements-for-while-and-do">Basic Iteration Statements: "for", "while", and "do"</a></li>
<li><a class="reference internal" href="#iteration-over-active-program-instances-foreach-active">Iteration over active program instances: "foreach_active"</a></li>
<li><a class="reference internal" href="#iteration-over-unique-elements-foreach-unique">Iteration over unique elements: "foreach_unique"</a></li>
<li><a class="reference internal" href="#parallel-iteration-statements-foreach-and-foreach-tiled">Parallel Iteration Statements: "foreach" and "foreach_tiled"</a></li>
<li><a class="reference internal" href="#parallel-iteration-with-programindex-and-programcount">Parallel Iteration with "programIndex" and "programCount"</a></li>
</ul>
</li>
<li><a class="reference internal" href="#unstructured-control-flow-goto">Unstructured Control Flow: "goto"</a></li>
<li><a class="reference internal" href="#coherent-control-flow-statements-cif-and-friends">"Coherent" Control Flow Statements: "cif" and Friends</a></li>
<li><a class="reference internal" href="#functions-and-function-calls">Functions and Function Calls</a><ul>
<li><a class="reference internal" href="#function-overloading">Function Overloading</a></li>
</ul>
</li>
<li><a class="reference internal" href="#re-establishing-the-execution-mask">Re-establishing The Execution Mask</a></li>
<li><a class="reference internal" href="#task-parallel-execution">Task Parallel Execution</a><ul>
<li><a class="reference internal" href="#task-parallelism-launch-and-sync-statements">Task Parallelism: "launch" and "sync" Statements</a></li>
<li><a class="reference internal" href="#task-parallelism-runtime-requirements">Task Parallelism: Runtime Requirements</a></li>
</ul>
</li>
</ul>
</li>
<li><a class="reference internal" href="#llvm-intrinsic-functions">LLVM Intrinsic Functions</a></li>
<li><a class="reference internal" href="#function-templates">Function Templates</a></li>
</ul>
</li>
<li><a class="reference internal" href="#the-ispc-standard-library">The ISPC Standard Library</a><ul>
<li><a class="reference internal" href="#basic-operations-on-data">Basic Operations On Data</a><ul>
<li><a class="reference internal" href="#logical-and-selection-operations">Logical and Selection Operations</a></li>
<li><a class="reference internal" href="#bit-operations">Bit Operations</a></li>
</ul>
</li>
<li><a class="reference internal" href="#math-functions">Math Functions</a><ul>
<li><a class="reference internal" href="#basic-math-functions">Basic Math Functions</a></li>
<li><a class="reference internal" href="#transcendental-functions">Transcendental Functions</a></li>
<li><a class="reference internal" href="#saturating-arithmetic">Saturating Arithmetic</a></li>
<li><a class="reference internal" href="#dot-product">Dot product</a></li>
<li><a class="reference internal" href="#pseudo-random-numbers">Pseudo-Random Numbers</a></li>
<li><a class="reference internal" href="#random-numbers">Random Numbers</a></li>
</ul>
</li>
<li><a class="reference internal" href="#output-functions">Output Functions</a></li>
<li><a class="reference internal" href="#assertions">Assertions</a></li>
<li><a class="reference internal" href="#compiler-optimization-hints">Compiler Optimization Hints</a></li>
<li><a class="reference internal" href="#cross-program-instance-operations">Cross-Program Instance Operations</a><ul>
<li><a class="reference internal" href="#reductions">Reductions</a></li>
</ul>
</li>
<li><a class="reference internal" href="#stack-memory-allocation">Stack Memory Allocation</a></li>
<li><a class="reference internal" href="#data-movement">Data Movement</a><ul>
<li><a class="reference internal" href="#setting-and-copying-values-in-memory">Setting and Copying Values In Memory</a></li>
<li><a class="reference internal" href="#packed-load-and-store-operations">Packed Load and Store Operations</a></li>
<li><a class="reference internal" href="#streaming-load-and-store-operations">Streaming Load and Store Operations</a></li>
</ul>
</li>
<li><a class="reference internal" href="#data-conversions">Data Conversions</a><ul>
<li><a class="reference internal" href="#converting-between-array-of-structures-and-structure-of-arrays-layout">Converting Between Array-of-Structures and Structure-of-Arrays Layout</a></li>
<li><a class="reference internal" href="#conversions-to-and-from-half-precision-floats">Conversions To and From Half-Precision Floats</a></li>
<li><a class="reference internal" href="#converting-to-srgb8">Converting to sRGB8</a></li>
</ul>
</li>
<li><a class="reference internal" href="#systems-programming-support">Systems Programming Support</a><ul>
<li><a class="reference internal" href="#atomic-operations-and-memory-fences">Atomic Operations and Memory Fences</a></li>
<li><a class="reference internal" href="#prefetches">Prefetches</a></li>
<li><a class="reference internal" href="#system-information">System Information</a></li>
</ul>
</li>
</ul>
</li>
<li><a class="reference internal" href="#interoperability-with-the-application">Interoperability with the Application</a><ul>
<li><a class="reference internal" href="#interoperability-overview">Interoperability Overview</a></li>
<li><a class="reference internal" href="#data-layout">Data Layout</a></li>
<li><a class="reference internal" href="#data-alignment-and-aliasing">Data Alignment and Aliasing</a></li>
<li><a class="reference internal" href="#restructuring-existing-programs-to-use-ispc">Restructuring Existing Programs to Use ISPC</a></li>
</ul>
</li>
<li><a class="reference internal" href="#notices-disclaimers">Notices &amp; Disclaimers</a></li>
</ul>
<div class="section" id="recent-changes-to-ispc">
<h1>Recent Changes to ISPC</h1>
<p>See the file <a class="reference external" href="https://raw.github.com/ispc/ispc/main/docs/ReleaseNotes.txt">ReleaseNotes.txt</a> in the <tt class="docutils literal">ispc</tt> distribution for a list
of recent changes to the compiler.</p>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-1">
<h2>Updating ISPC Programs For Changes In ISPC 1.1</h2>
<p>The major changes introduced in the 1.1 release of <tt class="docutils literal">ispc</tt> are first-class
support for pointers in the language and new parallel loop constructs.
Adding this functionality required a number of syntactic changes to the
language. These changes should generally lead to straightforward minor
modifications of existing <tt class="docutils literal">ispc</tt> programs.</p>
<p>These are the relevant changes to the language:</p>
<ul class="simple">
<li>The syntax for reference types has been changed to match C++'s syntax for
references and the <tt class="docutils literal">reference</tt> keyword has been removed. (A diagnostic
message is issued if <tt class="docutils literal">reference</tt> is used.)<ul>
<li>Declarations like <tt class="docutils literal">reference float foo</tt> should be changed to <tt class="docutils literal">float &amp;foo</tt>.</li>
<li>Any array parameter in a function declaration with a <tt class="docutils literal">reference</tt>
qualifier should just have <tt class="docutils literal">reference</tt> removed: <tt class="docutils literal">void foo(reference
float <span class="pre">bar[])</span></tt> can just be <tt class="docutils literal">void foo(float <span class="pre">bar[])</span></tt>.</li>
</ul>
</li>
<li>It is now a compile-time error to assign an entire array to another
array.</li>
<li>A number of standard library routines have been updated to take
pointer-typed parameters, rather than references or arrays and index
offsets, as appropriate. For example, the <tt class="docutils literal">atomic_add_global()</tt>
function previously took a reference to the variable to be updated
atomically but now takes a pointer. Similarly,
<tt class="docutils literal">packed_store_active()</tt> now takes a pointer to a <tt class="docutils literal">uniform unsigned int</tt>
as its first parameter, rather than a <tt class="docutils literal">uniform unsigned int[]</tt> as
its first parameter and a <tt class="docutils literal">uniform int</tt> offset as its second parameter.</li>
<li>It is no longer legal to pass a varying lvalue to a function that takes a
reference parameter; references can only be to uniform lvalue types. In
this case, the function should be rewritten to take a varying pointer
parameter.</li>
<li>There are new iteration constructs for looping over computation domains,
<tt class="docutils literal">foreach</tt> and <tt class="docutils literal">foreach_tiled</tt>. In addition to being syntactically
cleaner than regular <tt class="docutils literal">for</tt> loops, these can provide performance
benefits in many cases when iterating over data and mapping it to program
instances. See the Section <a class="reference internal" href="#parallel-iteration-statements-foreach-and-foreach-tiled">Parallel Iteration Statements: "foreach" and
"foreach_tiled"</a> for more information about these.</li>
</ul>
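<p>As an illustrative sketch of these changes (the function names <tt class="docutils literal">accumulate</tt>
and <tt class="docutils literal">scale_all</tt> are hypothetical and not part of <tt class="docutils literal">ispc</tt>), a reference
parameter uses the C++-style syntax, an array parameter carries no <tt class="docutils literal">reference</tt>
qualifier, and a <tt class="docutils literal">foreach</tt> loop iterates over the computation domain:</p>
<pre class="literal-block">
// Before 1.1 this parameter would have been declared with the "reference"
// keyword. References may only refer to uniform lvalues, so "total" is uniform.
void accumulate(uniform float &amp;total, uniform float value) {
    total += value;
}

// Array parameters simply drop the "reference" qualifier.
export void scale_all(uniform float data[], uniform int count,
                      uniform float factor) {
    // foreach maps the iterations 0 .. count-1 onto program instances.
    foreach (i = 0 ... count) {
        data[i] *= factor;
    }
}
</pre>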
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-2">
<h2>Updating ISPC Programs For Changes In ISPC 1.2</h2>
<p>The following changes were made to the language syntax and semantics for
the <tt class="docutils literal">ispc</tt> 1.2 release:</p>
<ul class="simple">
<li>Syntax for the "launch" keyword has been cleaned up; it is no longer
necessary to bracket the launched function call with angle brackets. (In
other words, now use <tt class="docutils literal">launch <span class="pre">foo();</span></tt>, rather than <tt class="docutils literal">launch &lt; foo() &gt;;</tt>.)</li>
<li>When using pointers, the pointed-to data type is now "uniform" by
default. Use the varying keyword to specify varying pointed-to types
when needed. (That is, <tt class="docutils literal">float *ptr</tt> is a varying pointer to uniform float
data, whereas previously it was a varying pointer to varying float
values.) Use <tt class="docutils literal">varying float *</tt> to specify a varying pointer to varying
float data, and so forth.</li>
<li>The details of "uniform" and "varying" and how they interact with struct
types have been cleaned up. Now, when a struct type is declared, if the
struct elements don't have explicit "uniform" or "varying" qualifiers,
they are said to have "unbound" variability. When a struct type is
instantiated, any unbound variability elements inherit the variability of
the parent struct type. See <a class="reference internal" href="#struct-types">Struct Types</a> for more details.</li>
<li><tt class="docutils literal">ispc</tt> has a new language feature that makes it much easier to use the
efficient "(array of) structure of arrays" (AoSoA, or SoA) memory layout
of data. A new <tt class="docutils literal">soa&lt;n&gt;</tt> qualifier can be applied to structure types to
specify an n-wide SoA version of the corresponding type. Array indexing
and pointer operations with arrays of SoA types automatically handle the
two-stage indexing calculation to access the data. See <a class="reference internal" href="#structure-of-array-types">Structure of
Array Types</a> for more details.</li>
</ul>
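<p>A minimal sketch of the 1.2 <tt class="docutils literal">launch</tt> syntax and pointer defaults follows. The
names <tt class="docutils literal">fill</tt>, <tt class="docutils literal">run</tt>, and <tt class="docutils literal">pointer_kinds</tt> are hypothetical, and linking
a program that launches tasks assumes a task system as described in
<a class="reference internal" href="#task-parallelism-runtime-requirements">Task Parallelism: Runtime Requirements</a>:</p>
<pre class="literal-block">
// Hypothetical task that fills an array with a value.
task void fill(uniform float out[], uniform int count, uniform float value) {
    foreach (i = 0 ... count) {
        out[i] = value;
    }
}

export void run(uniform float out[], uniform int count) {
    launch fill(out, count, 1.0f);  // previously: launch &lt; fill(out, count, 1.0f) &gt;;
    sync;                           // wait for the launched task to finish
}

void pointer_kinds() {
    float *p = NULL;                   // varying pointer to uniform float
    varying float *q = NULL;           // varying pointer to varying float
    uniform float * uniform r = NULL;  // uniform pointer to uniform float
}
</pre>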
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-3">
<h2>Updating ISPC Programs For Changes In ISPC 1.3</h2>
<p>This release adds a number of new iteration constructs, which in turn use
new reserved words: <tt class="docutils literal">unmasked</tt>, <tt class="docutils literal">foreach_unique</tt>, <tt class="docutils literal">foreach_active</tt>,
and <tt class="docutils literal">in</tt>. Any program that happens to have a variable or function with
one of these names must be modified to rename that symbol.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-5-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.5.0</h2>
<p>This release adds support for double precision floating point constants.
A double precision floating point constant is a floating point number with a
<tt class="docutils literal">d</tt> suffix and an optional exponent part. Here are some examples: 3.14d,
31.4d-1, 1.d, 1.0d, 1d-2. Note that a floating point number without a suffix is
treated as a single precision constant.</p>
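<p>For instance, the constants listed above might appear in declarations such as
these (a sketch; the function and variable names are hypothetical):</p>
<pre class="literal-block">
void literals() {
    uniform double a = 3.14d;    // double precision
    uniform double b = 31.4d-1;  // double precision with an exponent part
    uniform double c = 1d-2;     // the fractional part is optional
    uniform float  f = 3.14;     // no suffix: single precision
}
</pre>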
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-6-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.6.0</h2>
<p>This release adds support for <a class="reference internal" href="#operators-overloading">Operators Overloading</a>, so the word <tt class="docutils literal">operator</tt>
becomes a keyword and may conflict with existing user
functions. A new library function, packed_store_active2(), was also introduced
and may likewise conflict with existing user functions.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-7-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.7.0</h2>
<p>This release contains several changes that may affect compatibility with
older versions:</p>
<ul class="simple">
<li>The algorithm for selecting overloaded functions was extended to cover more
types of overloading, and handling of reference types was fixed. At the same
time the old scheme, which blindly used the function with "the best score"
summed for all arguments, was switched to the C++ approach, which requires
"the best score" for each argument. If the best function doesn't exist, a
warning is issued in this version. It will be turned into an error in the
next version. A simple example: Suppose we have two functions: max(int, int)
and max(unsigned int, unsigned int). The new rules lead to an error when
calling max(int, unsigned int), as the best choice is ambiguous.</li>
<li>Implicit cast of pointer to const type to void* was disallowed. Use explicit
cast if needed.</li>
<li>A bug which prevented "const" qualifiers from appearing in emitted .h files
was fixed. Consequently, "const" qualifiers that now properly appear in emitted
.h files may cause compile errors in pre-existing code.</li>
<li>get_ProgramCount() was moved from stdlib to the examples/util/util.isph file. You
need to include this file to be able to use this function.</li>
</ul>
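<p>The overload-resolution change can be sketched as follows. The function name
<tt class="docutils literal">pick</tt> is hypothetical and stands in for the <tt class="docutils literal">max</tt> example above; an
explicit cast makes one overload the best match for every argument:</p>
<pre class="literal-block">
int pick(int a, int b) { return a &gt; b ? a : b; }
unsigned int pick(unsigned int a, unsigned int b) { return a &gt; b ? a : b; }

void demo(int i, unsigned int u) {
    // pick(i, u);  // ambiguous: neither overload is best for both arguments
    int r = pick(i, (int)u);                    // selects pick(int, int)
    unsigned int s = pick((unsigned int)i, u);  // selects pick(unsigned int, unsigned int)
}
</pre>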
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-8-2">
<h2>Updating ISPC Programs For Changes In ISPC 1.8.2</h2>
<p>This release contains no language changes that may affect compatibility with
older versions. However, you may want to be aware of the following:</p>
<ul class="simple">
<li>Mangling of uniform types was changed to not include varying width, so now you
may use uniform structures and pointers to uniform types as return types in
export functions in multi-target compilation.</li>
</ul>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-9-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.9.0</h2>
<p>This release contains no language changes that may affect compatibility with
older versions. It introduces a new AVX512 target: avx512knl-i32x16.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-9-1">
<h2>Updating ISPC Programs For Changes In ISPC 1.9.1</h2>
<p>This release contains no language changes that may affect compatibility with
older versions. It introduces a new AVX512 target: avx512skx-i32x16.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-9-2">
<h2>Updating ISPC Programs For Changes In ISPC 1.9.2</h2>
<p>This release contains no language changes that may affect compatibility with
older versions.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-10-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.10.0</h2>
<p>The release has several new language features, which do not affect compatibility.
Namely, new streaming stores, aos_to_soa/soa_to_aos intrinsics for 64 bit types,
and a "#pragma ignore".</p>
<p>One change may affect compatibility: the size of short vector
types has changed. If you use short vector types for data passed between C/C++ and ISPC,
you should review that code.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-11-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.11.0</h2>
<p>This release redefined -O1 compiler option to optimize for size, so it may require
adjusting your build system accordingly.</p>
<p>Starting with version 1.11.0, auto-generated headers use <tt class="docutils literal">#pragma once</tt>. In the unlikely
case that your C/C++ compiler does not support it, use the <tt class="docutils literal"><span class="pre">--no-pragma-once</span></tt>
<tt class="docutils literal">ispc</tt> switch.</p>
<p>This release also introduces a new AVX512 target, avx512skx-i32x8, which produces code
that doesn't use ZMM registers.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-12-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.12.0</h2>
<p>This release contains the following changes that may affect compatibility with
older versions:</p>
<ul class="simple">
<li><tt class="docutils literal">noinline</tt> keyword was added.</li>
<li>Standard library functions <tt class="docutils literal">rsqrt_fast()</tt> and <tt class="docutils literal">rcp_fast()</tt> were added.</li>
<li>AVX1.1 (IvyBridge) targets and generic KNC and KNL targets were removed.
Note that KNL is still supported through avx512knl-i32x16.</li>
</ul>
<p>The release also introduces static initialization for varying variables, which
should not affect compatibility.</p>
<p>This release introduces experimental cross OS compilation support and ARM/AARCH64
support. It also contains a new 128-bit AVX2 target (avx2-i32x4) and a CPU
definition for Ice Lake client (--device=icl).</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-13-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.13.0</h2>
<p>This release contains the following changes that may affect compatibility with
older versions:</p>
<ul class="simple">
<li>Representation of <tt class="docutils literal">bool</tt> type in storage was changed from target-specific to
one byte per boolean value. So the size of <tt class="docutils literal">varying bool</tt> is the target width (in
bytes), and the size of <tt class="docutils literal">uniform bool</tt> is one byte. This definition is compatible
with C/C++ and hence improves interoperability.</li>
<li>Type aliases for unsigned types were added: <tt class="docutils literal">uint8</tt>, <tt class="docutils literal">uint16</tt>, <tt class="docutils literal">uint32</tt>,
<tt class="docutils literal">uint64</tt>, and <tt class="docutils literal">uint</tt>. To detect whether these types are supported, you can
check whether the ISPC_UINT_IS_DEFINED macro is defined. This is handy for writing code
which works with older versions of <tt class="docutils literal">ispc</tt>.</li>
<li><tt class="docutils literal">extract()</tt>/<tt class="docutils literal">insert()</tt> for boolean arguments, and <tt class="docutils literal">abs()</tt> for all integer and
FP types were added to the standard library.</li>
</ul>
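<p>One way to keep a source file working with both older and newer compilers is
to supply the aliases only when the macro is absent. This is a sketch, assuming
the older compiler accepts these <tt class="docutils literal">typedef</tt> declarations:</p>
<pre class="literal-block">
#ifndef ISPC_UINT_IS_DEFINED
// Older ispc without built-in aliases: define them ourselves.
typedef unsigned int8 uint8;
typedef unsigned int16 uint16;
typedef unsigned int32 uint32;
typedef unsigned int64 uint64;
typedef unsigned int uint;
#endif
</pre>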
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-14-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.14.0</h2>
<p>This release contains the following changes that may affect compatibility with
older versions:</p>
<ul class="simple">
<li>"generic" targets were removed. Please use native targets instead.</li>
</ul>
<p>New i8 and i16 targets were introduced: avx2-i8x32, avx2-i16x16, avx512skx-i8x64,
and avx512skx-i16x32.</p>
<p>The Windows x86_64 target now supports the <tt class="docutils literal">__vectorcall</tt> calling convention.
It is off by default and can be enabled with the <tt class="docutils literal"><span class="pre">--vectorcall</span></tt> command-line switch.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-14-1">
<h2>Updating ISPC Programs For Changes In ISPC 1.14.1</h2>
<p>This release contains no language changes that may affect compatibility with
older versions.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-15-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.15.0</h2>
<p>The release has several new language features, which do not affect compatibility.
Namely, packed_[load|store]_active() stdlib functions for 64 bit types, and loop
unroll pragmas: "#pragma unroll" and "#pragma nounroll".</p>
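<p>As a rough sketch of how the loop pragmas might be used, each directive is placed
immediately before the loop it applies to. The function and parameter names below are
hypothetical:</p>
<pre class="literal-block">
export void scale(uniform float data[], uniform int count, uniform float factor) {
    // Hint that this loop should be unrolled.
    #pragma unroll
    for (uniform int i = 0; i < count; ++i)
        data[i] *= factor;
}

export void shift(uniform float data[], uniform int count, uniform float delta) {
    // Hint that this loop should not be unrolled.
    #pragma nounroll
    for (uniform int i = 0; i < count; ++i)
        data[i] += delta;
}
</pre>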
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-16-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.16.0</h2>
<p>The release has several new functions in the standard library, that can possibly
affect compatibility:</p>
<ul class="simple">
<li><tt class="docutils literal">alloca()</tt> - refer to <a class="reference internal" href="#stack-memory-allocation">Stack Memory Allocation</a> for more details.</li>
<li><tt class="docutils literal">assume()</tt> - refer to <a class="reference internal" href="#compiler-optimization-hints">Compiler Optimization Hints</a> for more details.</li>
<li><tt class="docutils literal">trunc()</tt> - refer to <a class="reference internal" href="#basic-math-functions">Basic Math Functions</a> for more details.</li>
</ul>
<p>The language gained an experimental feature for calling LLVM intrinsics. This
should not affect compatibility with existing programs.
See <a class="reference internal" href="#llvm-intrinsic-functions">LLVM Intrinsic Functions</a> for more details.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-17-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.17.0</h2>
<p>The release introduces a new data type, <tt class="docutils literal">float16</tt>, and floating point literals
with the <tt class="docutils literal">f16</tt> suffix.</p>
<p>For the sake of unification with C/C++, capital letter X may be used in
hexadecimal prefix (<tt class="docutils literal">0X</tt>) and capital letter P as a separator for exponent in
hexadecimal floating point. For example: <tt class="docutils literal">0X1P16</tt>.</p>
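<p>The new literal forms can be illustrated with a few declarations. This is a
minimal sketch; the variable names are hypothetical:</p>
<pre class="literal-block">
uniform float16 half_value = 1.5f16;   // float16 literal with the f16 suffix
uniform float big = 0X1P16;            // hexadecimal floating point: 1 * 2^16 = 65536
uniform int low_byte = 0XFF;           // capital X in the hexadecimal prefix: 255
</pre>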
<p>The naming of Xe targets, architectures, and device names has changed.</p>
<p>The standard library gained new <tt class="docutils literal"><span class="pre">prefetchw_{l1,l2,l3}()</span></tt> intrinsics for
prefetching in anticipation of a write.</p>
<p>The algorithms used for implementation of <tt class="docutils literal">rsqrt(double)</tt> and <tt class="docutils literal">rcp(double)</tt>
standard library functions have changed on AVX512, which may affect existing
code.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-18-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.18.0</h2>
<p>AVX512 targets were renamed to drop the "base type" (or "mask size"); the old names are still accepted for
compatibility. The new names are avx512skx-x4, avx512skx-x8, avx512skx-x16,
avx512skx-x32, avx512skx-x64, and avx512knl-x16.</p>
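<p>For example, a build line that previously used the target name introduced in
1.9.1 can be written with the new name. The file names are placeholders:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o --target=avx512skx-i32x16
ispc foo.ispc -o foo.o --target=avx512skx-x16
</pre>
<p>The first line uses the old name, which is still accepted; the second uses the new one.</p>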
<p>The standard library gained full support for the <tt class="docutils literal">float16</tt> type. Note that it is
fully supported only on targets with native hardware support.
On other targets emulation is still not guaranteed, but may work in some cases.</p>
<p>The compiler gained support for the <tt class="docutils literal"><span class="pre">-E</span></tt> switch for running the preprocessor only,
which is similar to the corresponding switch of C/C++ compilers. Also, as a result of a bug fix,
compilation now fails when the preprocessor reports an error. Previously the compiler continued and
produced some output (sometimes correct!). Because this was a convenient feature for some
users running experiments in isolated environments (for example, ignoring missing includes
when compiling on <a class="reference external" href="https://godbolt.org/">Compiler Explorer</a>), the <tt class="docutils literal"><span class="pre">--ignore-preprocessor-errors</span></tt> switch
was added to preserve this behavior.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-19-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.19.0</h2>
<p>New targets were added:</p>
<ul class="simple">
<li>avx512spr-x4, avx512spr-x8, avx512spr-x16, avx512spr-x32, avx512spr-x64 for
4th generation Intel® Xeon® Scalable (codename Sapphire Rapids) CPUs. A macro
ISPC_TARGET_AVX512SPR was added.</li>
<li>xehpc-x16 and xehpc-x32 for Intel® Data Center GPU Max (codename Ponte Vecchio).</li>
</ul>
<p>Function templates were introduced to the language. Please refer to the <a class="reference internal" href="#function-templates">Function
Templates</a> section for more details. Two new keywords were introduced: <tt class="docutils literal">template</tt>
and <tt class="docutils literal">typename</tt>.</p>
<p><tt class="docutils literal">ISPC_FP16_SUPPORTED</tt> macro was introduced for the targets supporting FP16.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-20-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.20.0</h2>
<p>New versions of the <cite>sse4</cite> targets were added: you can now specify either <cite>sse4.1</cite>
or <cite>sse4.2</cite>, for example <cite>sse4.2-i32x4</cite>. The changes are fully backward
compatible, meaning that <cite>sse4</cite> versions are still accepted and aliased to
<cite>sse4.2</cite>. Multi-target compilation accepts only one of <cite>sse4</cite>/<cite>sse4.1</cite>/<cite>sse4.2</cite>
targets. All of these targets will produce an object file with <cite>sse4</cite> suffix in
multi-target compilation.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-21-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.21.0</h2>
<p>In case of signed integer overflow, <cite>ispc</cite> now assumes undefined behavior, similar to
C and C++. This change may cause compatibility issues. You can manage this behavior by
using the <cite>--[no-]wrap-signed-int</cite> compiler switch. The previous default behavior (before version
1.21.0) can be preserved by using <cite>--wrap-signed-int</cite>, which maintains defined wraparound
behavior for signed integers, though it may limit some compiler optimizations.</p>
<p>Template function specializations with explicit template arguments were introduced to the
language. Please refer to the <a class="reference internal" href="#function-templates">Function Templates</a> section for more details.</p>
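<p>For example, to keep the pre-1.21.0 wraparound semantics for an existing code
base, the switch can be added to the compile line. The file names are placeholders:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o --wrap-signed-int
</pre>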
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-22-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.22.0</h2>
<p>Template operators with explicit specializations and instantiations were introduced to
the language. The usage of different function specifiers with templates was fixed and
aligned. Please refer to the <a class="reference internal" href="#function-templates">Function Templates</a> section for more details.</p>
<p>The command-line switch <cite>--dwarf-version=<n></cite> now forces DWARF format debug info generation
on Windows. This allows debugging ISPC code linked with MinGW-generated code.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-23-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.23.0</h2>
<p>This release contains the following changes that may affect compatibility with
older versions:</p>
<ul class="simple">
<li><cite>true</cite> <cite>bool</cite> values in storage were changed from <cite>-1</cite> to <cite>1</cite> to match C/C++ ABI.
Previously, ISPC treated <cite>bool</cite> values similarly to C/C++ in terms of size, but
incorrectly interpreted their actual values. This meant that <cite>true</cite> in ISPC
might not have translated correctly to true in C/C++. This issue was introduced
in version 1.13.0. Starting now, ISPC correctly stores and interprets <cite>true</cite>
values in a way that aligns with C/C++ expectations.</li>
</ul>
<p>A couple of improvements have been made to variable initialization:</p>
<ul class="simple">
<li>Variables with const qualifiers can be initialized using the values of
previously initialized const variables, including arithmetic operations on
them. This now also works with varying types.</li>
<li>Enumeration type values can be used as constants.</li>
</ul>
<p>The result of the selection operator can now be used as an lvalue if it has a suitable
type.</p>
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-24-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.24.0</h2>
<p>This release extends the standard library with new functions performing dot
product operations. These functions utilize specific hardware instructions from
AVX-VNNI and AVX512-VNNI. The ISPC targets that support native VNNI
instructions are <tt class="docutils literal"><span class="pre">avx2vnni-i32x*</span></tt>, <tt class="docutils literal"><span class="pre">avx512icl-*</span></tt> and <tt class="docutils literal"><span class="pre">avx512spr-*</span></tt>. The
first two targets (<tt class="docutils literal"><span class="pre">avx2vnni-*</span></tt> and <tt class="docutils literal"><span class="pre">avx512icl-*</span></tt>) were introduced in this
release. Please refer to <a class="reference internal" href="#dot-product">Dot product</a> for more details.</p>
<p>Now, uniform integers and enums can be used as non-type template parameters.
Please refer to <a class="reference internal" href="#function-templates">Function Templates</a> for more details.</p>
<p>The release contains the following changes that may affect compatibility with
older versions:</p>
<ul class="simple">
<li><tt class="docutils literal"><span class="pre">--pic</span></tt> command line flag now corresponds to the <tt class="docutils literal"><span class="pre">-fpic</span></tt> flag of Clang
and GCC, whereas the newly introduced <tt class="docutils literal"><span class="pre">--PIC</span></tt> corresponds to <tt class="docutils literal"><span class="pre">-fPIC</span></tt>.
The previous behavior of <tt class="docutils literal"><span class="pre">--pic</span></tt> flag corresponded to <tt class="docutils literal"><span class="pre">-fPIC</span></tt> flag. In
some cases, to preserve previous behavior, users may need to switch to
<tt class="docutils literal"><span class="pre">--PIC</span></tt>.</li>
<li>Newly introduced macro definitions for numeric limits can cause conflicts
with user-defined macros with the same names. When this happens, ISPC emits
warnings about macro redefinition. Please refer to <a class="reference internal" href="#the-preprocessor">The Preprocessor</a> for
the full list of macro definitions.</li>
<li>The implementation of the <tt class="docutils literal">round</tt> standard library function was aligned across
all targets. It may affect the results of code that uses this
function on the following targets: <tt class="docutils literal"><span class="pre">avx2-i16x16</span></tt>, <tt class="docutils literal"><span class="pre">avx2-i8x32</span></tt> and all
<tt class="docutils literal">avx512</tt> targets. Please refer to <a class="reference internal" href="#basic-math-functions">Basic Math Functions</a> for more details.</li>
</ul>
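<p>To illustrate the <tt class="docutils literal"><span class="pre">--pic</span></tt> change listed above: a build that relied on the old
meaning of <tt class="docutils literal"><span class="pre">--pic</span></tt> would switch to the new flag. The file names are placeholders:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o --pic
ispc foo.ispc -o foo.o --PIC
</pre>
<p>The first line now corresponds to <tt class="docutils literal"><span class="pre">-fpic</span></tt>; the second corresponds to <tt class="docutils literal"><span class="pre">-fPIC</span></tt>, which is what
<tt class="docutils literal"><span class="pre">--pic</span></tt> meant before this release.</p>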
</div>
<div class="section" id="updating-ispc-programs-for-changes-in-ispc-1-25-0">
<h2>Updating ISPC Programs For Changes In ISPC 1.25.0</h2>
<p>The ISPC language has been extended to support the <tt class="docutils literal"><span class="pre">__attribute__(())</span></tt> syntax
for variable and function declarations. The following attributes are now
supported: <tt class="docutils literal">noescape</tt>, <tt class="docutils literal">address_space(N)</tt>, <tt class="docutils literal">external_only</tt>, and
<tt class="docutils literal">unmangled</tt>. The macro <tt class="docutils literal">ISPC_ATTRIBUTE_SUPPORTED</tt> is defined if the ISPC
compiler supports attribute syntax. Please refer to the <a class="reference internal" href="#attributes">Attributes</a> section
for more details and the full list of supported attributes.</p>
<p>This release introduces support for the <tt class="docutils literal"><span class="pre">-ffunction-sections</span></tt> command-line
flag, which generates each function in a separate section. This flag is useful
for reducing the size of the final executable by removing unused functions.
Please refer to the <a class="reference internal" href="#basic-command-line-options">Basic Command-line Options</a> section for more details.</p>
<p>In some cases, such as shared libraries, the <tt class="docutils literal"><span class="pre">-ffunction-sections</span></tt> flag alone
may not be sufficient to remove unused ISPC copies of exported functions. To
address this, you can use the <tt class="docutils literal">external_only</tt> function attribute. This
attribute can only be applied to exported functions and instructs the compiler
to remove the ISPC version of the function. For more information, please refer
to the <a class="reference internal" href="#attributes">Attributes</a> and <a class="reference internal" href="#functions-and-function-calls">Functions and Function Calls</a> sections.</p>
<p>Template support for short vectors and array declarations has been extended.
You can now use both type and non-type parameters to specify the type and
dimensions of these types.</p>
<p>For ARM targets, IEEE 754-compliant instructions (<tt class="docutils literal">fminnm</tt> and <tt class="docutils literal">vminnm</tt>) are
now generated for min/max operations, replacing the previous use of <tt class="docutils literal">fmin</tt> and
<tt class="docutils literal">vmin</tt>.</p>
<p>The <tt class="docutils literal"><span class="pre">avx512knl-x16</span></tt>, <tt class="docutils literal"><span class="pre">gen9-x8</span></tt>, and <tt class="docutils literal"><span class="pre">gen9-x16</span></tt> targets are deprecated and
will be removed in future releases.</p>
</div>
</div>
<div class="section" id="getting-started-with-ispc">
<h1>Getting Started with ISPC</h1>
<div class="section" id="installing-ispc">
<h2>Installing ISPC</h2>
<p>The <a class="reference external" href="http://ispc.github.io/downloads.html">ispc downloads web page</a> has prebuilt executables for Windows*,
Linux* and macOS* available for download. Alternatively, you can
download the source code from that page and build it yourself; see the
<a class="reference external" href="http://github.com/ispc/ispc/wiki">ispc wiki</a> for instructions about building <tt class="docutils literal">ispc</tt> from source.</p>
<p>Once you have an executable for your system, copy it into a directory
that's in your <tt class="docutils literal">PATH</tt>. Congratulations--you've now installed <tt class="docutils literal">ispc</tt>.</p>
</div>
<div class="section" id="compiling-and-running-a-simple-ispc-program">
<h2>Compiling and Running a Simple ISPC Program</h2>
<p>The directory <tt class="docutils literal">examples/simple</tt> in the <tt class="docutils literal">ispc</tt> distribution includes a
simple example of how to use <tt class="docutils literal">ispc</tt> with a short C++ program. See the
file <tt class="docutils literal">simple.ispc</tt> in that directory (also reproduced here).</p>
<pre class="literal-block">
export void simple(uniform float vin[], uniform float vout[],
                   uniform int count) {
    foreach (index = 0 ... count) {
        float v = vin[index];
        if (v < 3.)
            v = v * v;
        else
            v = sqrt(v);
        vout[index] = v;
    }
}
</pre>
<p>This program loops over an array of values in <tt class="docutils literal">vin</tt> and computes an
output value for each one. For each value in <tt class="docutils literal">vin</tt>, if its value is less
than three, the output is the value squared, otherwise it's the square root
of the value.</p>
<p>The first thing to notice in this program is the presence of the <tt class="docutils literal">export</tt>
keyword in the function definition; this indicates that the function should
be made available to be called from application code. The <tt class="docutils literal">uniform</tt>
qualifiers on the parameters to <tt class="docutils literal">simple</tt> indicate that the corresponding
variables are non-vector quantities--this concept is discussed in detail in the
<a class="reference internal" href="#uniform-and-varying-qualifiers">"uniform" and "varying" Qualifiers</a> section.</p>
<p>Each iteration of the <tt class="docutils literal">foreach</tt> loop works on a number of input values in
parallel--depending on the compilation target chosen, it may be 4, 8, 16, 32, or
even 64 elements of the <tt class="docutils literal">vin</tt> array, processed efficiently with the CPU's or
GPU's SIMD hardware. Here, the variable <tt class="docutils literal">index</tt> takes all values from 0 to
<tt class="docutils literal"><span class="pre">count-1</span></tt>. After the load from the array to the variable <tt class="docutils literal">v</tt>, the
program can then proceed, doing computation and control flow based on the
values loaded. The result from the running program instances is written to
the <tt class="docutils literal">vout</tt> array before the next iteration of the <tt class="docutils literal">foreach</tt> loop runs.</p>
<p>To build and run the examples, go to the <tt class="docutils literal">examples</tt> directory, create a <tt class="docutils literal">build</tt> folder,
and from inside it run <tt class="docutils literal">cmake <span class="pre">-DISPC_EXECUTABLE=<path_to_ispc_binary></span> ../</tt>. On Linux* and
macOS*, a makefile will be generated in that directory. On Windows*, a
Microsoft Visual Studio solution, <tt class="docutils literal">ispc_examples.sln</tt>, will be created. In
either case, build it now! (We'll walk through the details of the compilation
steps in the following section, <a class="reference internal" href="#using-the-ispc-compiler">Using The ISPC Compiler</a>.) In addition to
compiling the <tt class="docutils literal">ispc</tt> program, in this case the <tt class="docutils literal">ispc</tt> compiler also
generates a small header file, <tt class="docutils literal">simple.h</tt>. This header file includes the
declaration for the C-callable function that the above <tt class="docutils literal">ispc</tt> program is
compiled to. The relevant parts of this file are:</p>
<pre class="literal-block">
#ifdef __cplusplus
extern "C" {
#endif // __cplusplus
extern void simple(float vin[], float vout[], int32_t count);
#ifdef __cplusplus
}
#endif // __cplusplus
</pre>
<p>It's not mandatory to <tt class="docutils literal">#include</tt> the generated header file in your C/C++
code (you can alternatively use a manually-written <tt class="docutils literal">extern</tt> declaration
of the <tt class="docutils literal">ispc</tt> functions you use), but it's a helpful check to ensure that
the function signatures are as expected on both sides.</p>
<p>Here is the main program, <tt class="docutils literal">simple.cpp</tt>, which calls the <tt class="docutils literal">ispc</tt> function
above.</p>
<pre class="literal-block">
#include <stdio.h>
#include "simple.h"

int main() {
    float vin[16], vout[16];
    for (int i = 0; i < 16; ++i)
        vin[i] = i;

    simple(vin, vout, 16);

    for (int i = 0; i < 16; ++i)
        printf("%d: simple(%f) = %f\n", i, vin[i], vout[i]);
}
</pre>
<p>Note that the call to the <tt class="docutils literal">ispc</tt> function in the middle of <tt class="docutils literal">main()</tt> is
a regular function call. (And it has the same overhead as a C/C++ function
call, for that matter.)</p>
<p>When the executable <tt class="docutils literal">simple</tt> runs, it generates the expected output:</p>
<pre class="literal-block">
0: simple(0.000000) = 0.000000
1: simple(1.000000) = 1.000000
2: simple(2.000000) = 4.000000
3: simple(3.000000) = 1.732051
...
</pre>
<p>For a slightly more complex example of using <tt class="docutils literal">ispc</tt>, see the <a class="reference external" href="http://ispc.github.io/example.html">Mandelbrot
set example</a> page on the <tt class="docutils literal">ispc</tt> website for a walk-through of an <tt class="docutils literal">ispc</tt>
implementation of that algorithm. After reading through that example, you
may want to examine the source code of the various examples in the
<tt class="docutils literal">examples/</tt> directory of the <tt class="docutils literal">ispc</tt> distribution.</p>
</div>
</div>
<div class="section" id="using-the-ispc-compiler">
<h1>Using The ISPC Compiler</h1>
<p>To go from an <tt class="docutils literal">ispc</tt> source file to an object file that can be linked
with application code, enter the following command:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o
</pre>
<p>(On Windows, you may want to specify <tt class="docutils literal">foo.obj</tt> as the output filename.)</p>
<div class="section" id="basic-command-line-options">
<h2>Basic Command-line Options</h2>
<p>The <tt class="docutils literal">ispc</tt> executable can be run with <tt class="docutils literal"><span class="pre">--help</span></tt> to print a list of
accepted command-line arguments. By default, the compiler compiles the
provided program (and issues warnings and errors), but doesn't
generate any output.</p>
<p>If the <tt class="docutils literal"><span class="pre">-o</span></tt> flag is given, it will generate an output file (a native
object file by default).</p>
<pre class="literal-block">
ispc foo.ispc -o foo.obj
</pre>
<p>To generate a text assembly file, pass <tt class="docutils literal"><span class="pre">--emit-asm</span></tt>:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.s --emit-asm
</pre>
<p>To generate LLVM bitcode, use the <tt class="docutils literal"><span class="pre">--emit-llvm</span></tt> flag.
To generate LLVM bitcode in textual form, use the <tt class="docutils literal"><span class="pre">--emit-llvm-text</span></tt> flag.</p>
<p>To run only the preprocessor, use the <tt class="docutils literal"><span class="pre">-E</span></tt> flag.</p>
<pre class="literal-block">
ispc foo.ispc -E -o foo.i
ispc foo.ispc -E -o foo.ispi
</pre>
<p>In this mode, the output will be directed to <tt class="docutils literal">stdout</tt> if no output file is
specified. The standard suffixes <tt class="docutils literal">.i</tt> or <tt class="docutils literal">.ispi</tt> are assumed for preprocessor output.</p>
<p>By default, compilation fails if the preprocessor encounters an error.
To ignore preprocessor errors and proceed with the normal compilation flow,
use the <tt class="docutils literal"><span class="pre">--ignore-preprocessor-errors</span></tt> switch.</p>
<p>Optimizations are on by default; they can be turned off with <tt class="docutils literal"><span class="pre">-O0</span></tt>:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.obj -O0
</pre>
<p>There is support for generating debugging symbols; this is enabled with the
<tt class="docutils literal"><span class="pre">-g</span></tt> command-line flag. Using <tt class="docutils literal"><span class="pre">-g</span></tt> doesn't affect the optimization level;
to debug unoptimized code, also pass the <tt class="docutils literal"><span class="pre">-O0</span></tt> flag.</p>
<p>The <tt class="docutils literal"><span class="pre">-h</span></tt> flag can also be used to direct <tt class="docutils literal">ispc</tt> to generate a C/C++
header file that includes C/C++ declarations of the C-callable <tt class="docutils literal">ispc</tt>
functions and the types passed to them.</p>
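<p>For example, the following command line compiles the program and writes the
header alongside the object file. The file names are placeholders:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o -h foo.h
</pre>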
<p>The <tt class="docutils literal"><span class="pre">-D</span></tt> option can be used to specify definitions to be passed along to
the pre-processor, which runs over the program input before it's compiled.
For example, including <tt class="docutils literal"><span class="pre">-DTEST=1</span></tt> defines the pre-processor symbol
<tt class="docutils literal">TEST</tt> to have the value <tt class="docutils literal">1</tt> when the program is compiled.</p>
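<p>As a small sketch of how such a definition might be consumed, the command line
below defines <tt class="docutils literal">TEST</tt>, and the program checks for it. The function name
<tt class="docutils literal">test_enabled</tt> is hypothetical:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o -DTEST=1
</pre>
<pre class="literal-block">
// Returns 1 when the program was compiled with -DTEST=..., 0 otherwise.
export uniform int test_enabled() {
#ifdef TEST
    return 1;
#else
    return 0;
#endif
}
</pre>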
<p>The compiler issues a number of performance warnings for code constructs
that compile to relatively inefficient code. These warnings can be
silenced with the <tt class="docutils literal"><span class="pre">--wno-perf</span></tt> flag (or by using <tt class="docutils literal"><span class="pre">--woff</span></tt>, which turns
off all compiler warnings). Furthermore, <tt class="docutils literal"><span class="pre">--werror</span></tt> can be provided to
direct the compiler to treat any warnings as errors.</p>
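<p>For instance, a build that wants every remaining warning to be fatal while
ignoring performance warnings might combine the two flags. The file names are
placeholders:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o --wno-perf --werror
</pre>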
<p>The <tt class="docutils literal"><span class="pre">--pic</span></tt> flag can be used to generate position-independent code suitable
for use in a shared library. The <tt class="docutils literal"><span class="pre">--PIC</span></tt> flag can be used to generate
position-independent code suitable for dynamic linking avoiding any limit on
the size of the global offset table. When no <tt class="docutils literal"><span class="pre">--pic</span></tt> or <tt class="docutils literal"><span class="pre">--PIC</span></tt> flag is
provided, the compiler enforces target-specific default behavior.</p>
<p>The <tt class="docutils literal"><span class="pre">-ffunction-sections</span></tt> flag can be used to generate each function in a
separate section. This flag is useful for reducing the size of the final
executable by removing unused functions when it is combined with a linker flag
that removes unused sections: <tt class="docutils literal"><span class="pre">--gc-sections</span></tt> for <tt class="docutils literal">GNU ld</tt> and <tt class="docutils literal">/OPT:REF</tt>
for <tt class="docutils literal">MSVC link.exe</tt>. On macOS, this flag does not have any effect (as in
clang) because dead stripping <tt class="docutils literal"><span class="pre">-dead_strip</span></tt> for <tt class="docutils literal">ld64</tt> works differently.
The <tt class="docutils literal"><span class="pre">-fno-function-sections</span></tt> flag disables this behavior.</p>
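<p>A sketch of the two steps on Linux, assuming the application is linked through
<tt class="docutils literal">g++</tt> with GNU ld; <tt class="docutils literal">main.cpp</tt> and <tt class="docutils literal">app</tt> are placeholder names:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.o -ffunction-sections
g++ main.cpp foo.o -o app -Wl,--gc-sections
</pre>
<p>The <tt class="docutils literal"><span class="pre">-Wl,</span></tt> prefix passes <tt class="docutils literal"><span class="pre">--gc-sections</span></tt> through the compiler driver to the linker.</p>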
</div>
<div class="section" id="selecting-the-compilation-target">
<h2>Selecting The Compilation Target</h2>
<p>There are four options that affect the compilation target: <tt class="docutils literal"><span class="pre">--arch</span></tt>,
which sets the target architecture, <tt class="docutils literal"><span class="pre">--device</span></tt> (also may be spelled as <tt class="docutils literal"><span class="pre">--cpu</span></tt>),
which sets the target CPU or GPU, <tt class="docutils literal"><span class="pre">--target</span></tt>, which sets the target instruction
set, and <tt class="docutils literal"><span class="pre">--target-os</span></tt>, which sets the target operating system.</p>
<p>If none of these options is specified, <tt class="docutils literal">ispc</tt> generates code for the host
OS and for the architecture of the system the compiler is running on (i.e.,
64-bit x86-64 (<tt class="docutils literal"><span class="pre">--arch=x86-64</span></tt>) on x86 systems and ARM NEON on ARM systems).</p>
<p>To compile to a 32-bit x86 target, for example, supply <tt class="docutils literal"><span class="pre">--arch=x86</span></tt> on
the command line:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.obj --arch=x86
</pre>
<p>To compile for the Intel Xe LP platform:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.bin --target=xelp-x16 --device=tgllp --emit-zebin
</pre>
<p>Currently-supported architectures are <tt class="docutils literal">x86</tt>, <tt class="docutils literal"><span class="pre">x86-64</span></tt>, <tt class="docutils literal">xe64</tt>,
<tt class="docutils literal">arm</tt>, and <tt class="docutils literal">aarch64</tt>.</p>
<p>The target CPU determines both the default instruction set and the
CPU architecture the code is tuned for. <tt class="docutils literal">ispc <span class="pre">--help</span></tt> provides a
list of all of the supported CPUs. By default, the CPU type of the system
on which you're running <tt class="docutils literal">ispc</tt> is used to determine the target CPU.
To select a different one, pass <tt class="docutils literal"><span class="pre">--device</span></tt> explicitly:</p>
<pre class="literal-block">
ispc foo.ispc -o foo.obj --device=corei7-avx
</pre>
<p>Next, <tt class="docutils literal"><span class="pre">--target</span></tt> selects the target instruction set. For targets without
hardware support for masking, the target string is of the form <tt class="docutils literal"><span class="pre">[ISA]-i[mask</span> size]x[gang size]</tt>.
For example, <tt class="docutils literal"><span class="pre">--target=avx2-i32x16</span></tt> specifies a target with the AVX2 instruction set,
a mask size of 32 bits, and a gang size of 16. For targets with hardware masking support,
which are AVX512 and GPU targets, the target string is of the form
<tt class="docutils literal"><span class="pre">[ISA]-x[gang</span> size]</tt>. For example, <tt class="docutils literal"><span class="pre">--target=xehpg-x16</span></tt> specifies Intel XeHPG
as a target ISA and defines a gang size of 16.</p>
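<p>The two target-string forms above can be taken apart mechanically, which may be
handy in build scripts. The following is a minimal Python sketch, not part of <tt class="docutils literal">ispc</tt>;
the function name <tt class="docutils literal">parse_target</tt> and the regular expression are assumptions made
for illustration, and the sketch only checks the shape of the string, not whether
the target is actually supported.</p>

```python
import re

# Pattern for the two target-string forms described above:
#   [ISA]-i[mask size]x[gang size]   (no hardware masking)
#   [ISA]-x[gang size]               (hardware masking)
# Illustrative only: it validates the shape, not the set of supported targets.
_TARGET_RE = re.compile(r"^([a-z0-9.]+)-(?:i(\d+))?x(\d+)$")


def parse_target(target):
    """Split a target string into (isa, mask_bits, gang_size).

    mask_bits is None for the hardware-masking form.
    """
    match = _TARGET_RE.match(target)
    if match is None:
        raise ValueError("unrecognized target string: " + target)
    isa, mask, gang = match.groups()
    return isa, (int(mask) if mask else None), int(gang)


print(parse_target("avx2-i32x16"))  # ('avx2', 32, 16)
print(parse_target("xehpg-x16"))    # ('xehpg', None, 16)
```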
<p>By default, the target instruction set is the most capable
one supported by the system on which you're running <tt class="docutils literal">ispc</tt>. In this case a warning
is issued noting the target used for compilation. It is recommended to
always use the <tt class="docutils literal"><span class="pre">--target</span></tt> switch to specify the target explicitly.</p>
<p>To get the complete list of supported targets, use the <tt class="docutils literal"><span class="pre">--help</span></tt> switch
and note the list in the description of <tt class="docutils literal"><span class="pre">--target</span></tt>, or use the <tt class="docutils literal"><span class="pre">--support-matrix</span></tt>
switch, which lists all supported combinations
of target, architecture, and target OS.</p>
<p>The following target ISAs are supported:</p>
<table border="1" class="docutils">
<colgroup>
<col width="14%" />
<col width="86%" />
</colgroup>
<tbody valign="top">
<tr><td>Target</td>
<td>Description</td>
</tr>
<tr><td>avx, avx1</td>
<td>AVX (2010-2011 era Intel CPUs)</td>
</tr>
<tr><td>avx2</td>
<td>AVX 2 target (2013- Intel codename Haswell CPUs)</td>
</tr>
<tr><td>avx512knl</td>
<td>AVX 512 target (Xeon Phi chips codename Knights Landing)</td>
</tr>
<tr><td>avx512skx</td>
<td>AVX 512 target (Skylake Xeon CPUs)</td>
</tr>
<tr><td>avx512spr</td>
<td>AVX 512 target (Sapphire Rapids Xeon CPUs, 4th generation Xeon Scalable)</td>
</tr>
<tr><td>neon</td>
<td>ARM NEON</td>
</tr>
<tr><td>sse2</td>
<td>SSE2 (early 2000s era x86 CPUs)</td>
</tr>
<tr><td>sse4.1</td>
<td>SSE4.1 (2007 Intel codename Penryn CPUs)</td>
</tr>
<tr><td>sse4.2</td>
<td>SSE4.2 (2008-2010 Intel codename Nehalem CPUs)</td>
</tr>
<tr><td>gen9</td>
<td>Intel Gen9 GPU</td>
</tr>
<tr><td>xelp</td>
<td>Intel XeLP GPU</td>
</tr>
<tr><td>xehpg</td>
<td>Intel Arc GPU</td>
</tr>
<tr><td>xehpc</td>
<td>Intel Ponte Vecchio GPU</td>
</tr>
</tbody>
</table>
<p>Consult your CPU's manual for specifics on which vector instruction set it
supports.</p>
<p>The mask size may be 8, 16, 32, or 64 bits, though not all combinations of ISA
and mask size are supported. For best performance, the general
approach is to choose a mask size equal to the size of the most common
data type in your program. For example, if most of the computations are done using
32-bit floating-point values, an <tt class="docutils literal">i32</tt> target is appropriate. However,
if you're mostly doing computation with 8-bit data types, <tt class="docutils literal">i8</tt> is a better choice.</p>
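<p>As a rough illustration of this rule of thumb, the Python sketch below filters a list of
target names by mask size. The helper name <tt class="docutils literal">targets_for_element_width</tt> is hypothetical,
and <tt class="docutils literal">KNOWN_TARGETS</tt> is only the AVX2 subset of the full target list given in this section;
in practice the list reported by <tt class="docutils literal">ispc <span class="pre">--help</span></tt> is the authoritative source.</p>

```python
# AVX2 targets with an explicit mask size, copied from the full target
# list in this section (a subset chosen for illustration).
KNOWN_TARGETS = [
    "avx2-i8x32",
    "avx2-i16x16",
    "avx2-i32x4",
    "avx2-i32x8",
    "avx2-i32x16",
    "avx2-i64x4",
]


def targets_for_element_width(isa, element_bits, known=KNOWN_TARGETS):
    """Return the known targets whose mask size equals element_bits."""
    prefix = "{}-i{}x".format(isa, element_bits)
    return [t for t in known if t.startswith(prefix)]


# Mostly 32-bit floats: any of the i32 targets is a candidate.
print(targets_for_element_width("avx2", 32))
# ['avx2-i32x4', 'avx2-i32x8', 'avx2-i32x16']

# Mostly 8-bit data: the i8 target is the better match.
print(targets_for_element_width("avx2", 8))
# ['avx2-i8x32']
```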
<p>See <a class="reference internal" href="#basic-concepts-program-instances-and-gangs-of-program-instances">Basic Concepts: Program Instances and Gangs of Program Instances</a> for
more discussion of the "gang size" and its implications for program
execution.</p>
<p>The naming scheme for compilation targets changed in August 2013; the
following table shows the relationship between names in the old scheme and
in the new scheme:</p>
<table border="1" class="docutils">
<colgroup>
<col width="54%" />
<col width="46%" />
</colgroup>
<tbody valign="top">
<tr><td>Target</td>
<td>Former Name</td>
</tr>
<tr><td>avx1-i32x8</td>
<td>avx, avx1</td>
</tr>
<tr><td>avx1-i32x16</td>
<td>avx-x2</td>
</tr>
<tr><td>avx2-i32x8</td>
<td>avx2</td>
</tr>
<tr><td>avx2-i32x16</td>
<td>avx2-x2</td>
</tr>
<tr><td>neon-8</td>
<td>n/a</td>
</tr>
<tr><td>neon-16</td>
<td>n/a</td>
</tr>
<tr><td>neon-32</td>
<td>n/a</td>
</tr>
<tr><td>sse2-i32x4</td>
<td>sse2</td>
</tr>
<tr><td>sse2-i32x8</td>
<td>sse2-x2</td>
</tr>
<tr><td>sse4.2-i32x4</td>
<td>sse4</td>
</tr>
<tr><td>sse4.2-i32x8</td>
<td>sse4-x2</td>
</tr>
<tr><td>sse4.2-i8x16</td>
<td>n/a</td>
</tr>
<tr><td>sse4.2-i16x8</td>
<td>n/a</td>
</tr>
</tbody>
</table>
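<p>Build scripts that still carry the former names can translate them with a simple lookup.
The Python sketch below encodes the rows of the table above that have a former name;
the names <tt class="docutils literal">LEGACY_TO_CURRENT</tt> and <tt class="docutils literal">modernize_target</tt> are hypothetical and not part of
<tt class="docutils literal">ispc</tt>.</p>

```python
# Former target names mapped to their current equivalents, taken from
# the table above. Rows marked "n/a" have no former name and are omitted.
LEGACY_TO_CURRENT = {
    "avx": "avx1-i32x8",
    "avx1": "avx1-i32x8",
    "avx-x2": "avx1-i32x16",
    "avx2": "avx2-i32x8",
    "avx2-x2": "avx2-i32x16",
    "sse2": "sse2-i32x4",
    "sse2-x2": "sse2-i32x8",
    "sse4": "sse4.2-i32x4",
    "sse4-x2": "sse4.2-i32x8",
}


def modernize_target(name):
    """Return the current name for a former target name.

    Names that are not in the table are returned unchanged.
    """
    return LEGACY_TO_CURRENT.get(name, name)


print(modernize_target("sse4-x2"))      # sse4.2-i32x8
print(modernize_target("avx2-i32x16"))  # avx2-i32x16 (already current)
```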
<p>The full list of supported targets is below.</p>
<p>x86 targets:</p>
<p><tt class="docutils literal"><span class="pre">sse2-i32x4</span></tt>, <tt class="docutils literal"><span class="pre">sse2-i32x8</span></tt>, <tt class="docutils literal"><span class="pre">sse4.1-i8x16</span></tt>, <tt class="docutils literal"><span class="pre">sse4.1-i16x8</span></tt>, <tt class="docutils literal"><span class="pre">sse4.1-i32x4</span></tt>,
<tt class="docutils literal"><span class="pre">sse4.1-i32x8</span></tt>, <tt class="docutils literal"><span class="pre">sse4.2-i8x16</span></tt>, <tt class="docutils literal"><span class="pre">sse4.2-i16x8</span></tt>, <tt class="docutils literal"><span class="pre">sse4.2-i32x4</span></tt>, <tt class="docutils literal"><span class="pre">sse4.2-i32x8</span></tt>,
<tt class="docutils literal"><span class="pre">avx1-i32x4</span></tt>, <tt class="docutils literal"><span class="pre">avx1-i32x8</span></tt>, <tt class="docutils literal"><span class="pre">avx1-i32x16</span></tt>, <tt class="docutils literal"><span class="pre">avx1-i64x4</span></tt>, <tt class="docutils literal"><span class="pre">avx2-i8x32</span></tt>,
<tt class="docutils literal"><span class="pre">avx2-i16x16</span></tt>, <tt class="docutils literal"><span class="pre">avx2-i32x4</span></tt>, <tt class="docutils literal"><span class="pre">avx2-i32x8</span></tt>, <tt class="docutils literal"><span class="pre">avx2-i32x16</span></tt>, <tt class="docutils literal"><span class="pre">avx2-i64x4</span></tt>,
<tt class="docutils literal"><span class="pre">avx2vnni-i32x4</span></tt>, <tt class="docutils literal"><span class="pre">avx2vnni-i32x8</span></tt>, <tt class="docutils literal"><span class="pre">avx2vnni-i32x16</span></tt>,
<tt class="docutils literal"><span class="pre">avx512knl-x16</span></tt>, <tt class="docutils literal"><span class="pre">avx512skx-x4</span></tt>, <tt class="docutils literal"><span class="pre">avx512skx-x8</span></tt>, <tt class="docutils literal"><span class="pre">avx512skx-x16</span></tt>, <tt class="docutils literal"><span class="pre">avx512skx-x32</span></tt>,
<tt class="docutils literal"><span class="pre">avx512skx-x64</span></tt>, <tt class="docutils literal"><span class="pre">avx512icl-x4</span></tt>, <tt class="docutils literal"><span class="pre">avx512icl-x8</span></tt>, <tt class="docutils literal"><span class="pre">avx512icl-x16</span></tt>, <tt class="docutils literal"><span class="pre">avx512icl-x32</span></tt>,
<tt class="docutils literal"><span class="pre">avx512icl-x64</span></tt>, <tt class="docutils literal"><span class="pre">avx512spr-x4</span></tt>, <tt class="docutils literal"><span class="pre">avx512spr-x8</span></tt>, <tt class="docutils literal"><span class="pre">avx512spr-x16</span></tt>, <tt class="docutils literal"><span class="pre">avx512spr-x32</span></tt>,
<tt class="docutils literal"><span class="pre">avx512spr-x64</span></tt>.</p>
<p>Neon targets:</p>
<p><tt class="docutils literal"><span class="pre">neon-i8x16</span></tt>, <tt class="docutils literal"><span class="pre">neon-i16x8</span></tt>, <tt class="docutils literal"><span class="pre">neon-i32x4</span></tt>, <tt class="docutils literal"><span class="pre">neon-i32x8</span></tt>.</p>
<p>Xe targets:</p>
<p><tt class="docutils literal"><span class="pre">gen9-x8</span></tt>, <tt class="docutils literal"><span class="pre">gen9-x16</span></tt>, <tt class="docutils literal"><span class="pre">xelp-x8</span></tt>, <tt class="docutils literal"><span class="pre">xelp-x16</span></tt>, <tt class="docutils literal"><span class="pre">xehpg-x8</span></tt>, <tt class="docutils literal"><span class="pre">xehpg-x16</span></tt>, <tt class="docutils literal"><span class="pre">xehpc-x16</span></tt>, <tt class="docutils literal"><span class="pre">xehpc-x32</span></tt>.</p>
<p>Note that <tt class="docutils literal">sse4.1</tt> and <tt class="docutils literal">sse4.2</tt> targets may not be used together in
multi-target compilation. While the auto-dispatch code will correctly detect
the difference between these two ISAs, they both yield a binary with the <tt class="docutils literal">sse4</tt>
suffix. This limitation maintains backward compatibility with build
systems that expect the <tt class="docutils literal">sse4</tt> suffix.</p>
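<p>A build script could guard against this combination before invoking the compiler.
The following Python sketch is one way to do so; the function name
<tt class="docutils literal">has_sse4_conflict</tt> is hypothetical, and the check assumes each target string begins
with its ISA name followed by a hyphen, as in the list above.</p>

```python
def has_sse4_conflict(targets):
    """Return True if both sse4.1 and sse4.2 targets are requested.

    Each target is expected to look like "sse4.2-i32x4": the ISA name
    is everything before the first hyphen.
    """
    isas = {target.split("-", 1)[0] for target in targets}
    return "sse4.1" in isas and "sse4.2" in isas


print(has_sse4_conflict(["sse4.1-i32x4", "sse4.2-i32x4"]))  # True
print(has_sse4_conflict(["sse4.2-i32x4", "avx2-i32x8"]))    # False
```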
<p>Finally, <tt class="docutils literal"><span class="pre">--target-os</span></tt> selects the target operating system. Depending on
your host, <tt class="docutils literal">ispc</tt> may support Windows, Linux, macOS, Android, iOS, and PS4/PS5
targets. Running <tt class="docutils literal">ispc <span class="pre">--help</span></tt> and looking at the output for the <tt class="docutils literal"><span class="pre">--target-os</span></tt>
option gives the list of supported targets. By default, <tt class="docutils literal">ispc</tt> produces
code for your host operating system.</p>
<pre class="literal-block">
ispc foo.ispc -o foo.obj --target-os=android