Commits on Source (5110)
0b02aada
Ignore more flake8 warnings
Sep 20, 2015
bf224d20
Fix cmdline edit_kernel control
Sep 20, 2015
b8ba9581
Tweaks to schedule debugging
Sep 20, 2015
526a434e
Merge bodge:src/loopy
Sep 21, 2015
339c46f9
Bring in @jdsteve's fix for not counting multiplication by (-1)
Sep 21, 2015
8359a6ad
Add some logic to generate a richer loop_insn_dep_map, allowing the scheduler...
Sep 22, 2015
59e9f4e3
Merge bodge:src/loopy
Sep 22, 2015
645daafb
Avoid naming prefetch buffers from add_prefetch() *_fetch_0
Sep 22, 2015
a2c876b0
Merge bodge:src/loopy
Sep 22, 2015
c3e2bab4
Doctest fix
Sep 22, 2015
ffca5f0e
Doctest fix
Sep 22, 2015
b32bc371
Improve scheduler debug output
Sep 22, 2015
de650f45
Remove spurious debug assert
Sep 22, 2015
94656e0f
Track call sig change to dump_schedule
Sep 22, 2015
9620c617
Fix computation of check domains in the presence of boosting
Sep 22, 2015
9a3574b7
Allow variable-length arrays at least during transformation
Sep 22, 2015
7d33b5c7
experimenting with reg usage estimator
Sep 30, 2015
e2ba5adb
fixing merge conflicts
Sep 30, 2015
0f5e659a
Ignore more errors
Oct 01, 2015
82a249d5
Parse C single prec float literals out of OCCA defs
Oct 01, 2015
1b169516
Manage include path for Fortran preproc
Oct 01, 2015
615e1946
Pass kernel (instead of target) to manglers and preamble generators
Oct 08, 2015
ac980ba8
POCL *is* now mature enough to be used as reference in automatic testing
Oct 08, 2015
b23ff0e1
Implement indexof, indexof_vec
Oct 08, 2015
aa13e867
Fix vector_dtype override
Oct 08, 2015
a5914298
Make scheduling more deterministic
Oct 09, 2015
b6c3147c
Fix doctests after determinism changes
Oct 09, 2015
639e9a8a
working on reg usage estimator, still in progress
Oct 13, 2015
70194c9e
Merge remote-tracking branch 'upstream/master'
Oct 13, 2015
ffec7cab
added op counter that distinguishes operations, reg counter still in progress
Oct 14, 2015
4870c190
reg usage estimator in somewhat usable state
Oct 16, 2015
b554d9bc
Whitespace fix
Oct 17, 2015
de300e15
add_prefetch: Fix expand_subst call that gets rid of temporary substitution rule
Oct 17, 2015
9f9219e7
Allow setting suffixes for disambiguation of redundant names in fused kernels
Oct 17, 2015
e8a456e2
Fix: Allow setting suffixes for disambiguation of redundant names in fused kernels
Oct 18, 2015
66739211
Make kernel print order deterministic
Oct 18, 2015
1845b517
Placate flake8
Oct 18, 2015
ee6eee91
Catch and prevent order ambiguity when specifying precompute_inames
Oct 18, 2015
ae40aa62
Apply sorting in various spots to make code generation deterministic
Oct 18, 2015
bcedf75d
Don't use caching in CI runs
Oct 18, 2015
400b1c92
Fix reduction library for new mangler/preamble generator interface
Oct 18, 2015
baffa960
Don't expose l.auto to the user, yell at them if they use it (Fix #13)
Oct 18, 2015
d28d2e7d
Remove stray debug print
Oct 18, 2015
c83fc76b
Determinism fixes to automatic axis assignments
Oct 22, 2015
8f5bb4b2
updated tutorial, cleaning up code for merge with upstream
Oct 26, 2015
89a3cdd3
Merge remote-tracking branch 'upstream'
Oct 26, 2015
fe3497e7
removed old ExpressionOpCounter
Oct 26, 2015
4b1933db
removed get_op_counter_old
Oct 26, 2015
6622a32d
moved reg counter to perf model
Oct 26, 2015
6bcce4ec
Merge pull request #14 from jdsteve2/master
Oct 26, 2015
7364c0e6
Add is_expression_equal, use to accept that 2+n and n+2 are the same shape
Nov 04, 2015
bd3222c7
added version of add_and_infer_dtypes that accepts type dicts with unused variables
Nov 21, 2015
9f22be28
Merge pull request #16 from jdsteve2/master
Nov 21, 2015
80856821
Add 'realize_ilp' transformation
Nov 25, 2015
40de04de
Remove documentation for l.auto, which is no longer user-visible
Nov 27, 2015
bf3340f7
Improve iname tags docs
Nov 27, 2015
7c510c29
Precompute: Only assign automatic axes if automatic axis were created
Nov 28, 2015
c8f49034
Placate PEP8
Nov 28, 2015
e9852a20
Document dim_tags syntax, introduce optional dim_tags
Nov 28, 2015
56ab4189
Minor doc fix
Nov 28, 2015
0307cb95
Accept more iterables for sweep_inames in add_prefetch()
Nov 28, 2015
f8fcbec2
Modernize cleanups
Nov 28, 2015
5ac94e9e
Try two strategies for finding base indices/lengths
Nov 28, 2015
2c0311fa
Update loopy/__init__.py __all__ list
Nov 29, 2015
6fed8c2b
Teach tutorial about StaticValueFindingError
Nov 29, 2015
9b00fd35
Allow setting array dimension names, make use of them when naming things
Nov 30, 2015
b7ecaa8a
Add ignore_nonexistent kwarg to tag_inames
Nov 30, 2015
d32da35b
Typo fix
Nov 30, 2015
5608ba23
Don't print insn dependencies by default
Nov 30, 2015
2f343708
Fix lhs/rhs reversal typo, allow vectorizing 'vector = scalar'
Nov 30, 2015
886baacc
Minor tweaks to dependencies-non-printing
Nov 30, 2015
8cb367db
More kernel printing/tutorial tweaks
Nov 30, 2015
3d911f26
Restructure transforms code
Nov 30, 2015
bbfc3fa7
More transform shuffling
Nov 30, 2015
1c732a8f
Second half of previous transform jostling commit
Nov 30, 2015
0b5d34f0
Add missing import
Nov 30, 2015
4412cab0
Initial version of distributive law transform
Nov 30, 2015
a4bf021a
Use Subscript.index_tuple in more places
Dec 01, 2015
fe92b233
Leave a note about a possible error check
Dec 01, 2015
dfd763c2
Improve error checking in ArrayChanger
Dec 01, 2015
73b3ab13
Working version of the distributive law transform
Dec 01, 2015
3864505d
Add test for distributive law transform
Dec 01, 2015
ad7baaf2
Add initial version of gNUMA test
Dec 01, 2015
e9943a32
Add opt levels, make gNUMA test executable
Dec 02, 2015
f83de483
Minor loopy pyinstaller tweak
Dec 07, 2015
dea64f61
First steps towards an ISPC target
Dec 07, 2015
5710fca2
Finish ISPC backend
Dec 07, 2015
cd4df8a7
Use cgen from git
Dec 07, 2015
c4c1b6d3
Flatten target submodules, tweak CUDA backend
Dec 07, 2015
8dc0ea33
Support choosing target from the command line
Dec 07, 2015
b7651913
Fix set_default_target initial setup
Dec 08, 2015
8ebcc066
Fix CL image arguments
Dec 08, 2015
6211cd14
Fix test failure: revert OpenCL barrier spacing style
Dec 08, 2015
ad7288f5
ExpressionInstruction -> Assignment
Dec 16, 2015
61a833a8
Import cleanup
Jan 07, 2016
405c789e
Pass kernel to get_{global,local}_axis_expr, use signed indices on CUDA
Jan 07, 2016
fa3d3d66
CUDA target: generate launch bounds, extern C
Jan 07, 2016
19c24c8c
Bump cache version
Jan 07, 2016
8ba03bf4
Use cgen from git in old PyOpenCL build
Jan 07, 2016
0ebd02f1
Introduce placeholders for hw axes, rather than using target-specific expressions
Jan 07, 2016
5,010 additional commits have been omitted to prevent performance issues.