=======================================
Now Available: Cray Chapel Compiler 1.0
=======================================

Cray is pleased to announce the Cray Chapel Compiler release 1.0.
Chapel is a new parallel programming language designed to improve
productivity of high-end computer users while also serving as a
portable parallel programming model for use on commodity clusters or
desktop multi-core systems.

-----------------------------
About Chapel and this release
-----------------------------

Chapel is a new parallel programming language being developed by Cray
Inc. with the goal of improving programmer productivity. Chapel's
implementation is very much a work-in-progress and one that Cray is
undertaking under the open-source BSD license. We encourage feedback
from potential users in order to help improve Chapel's usefulness,
generality, and adoptability. We are also interested in seeking out
collaborations with external research groups.

The highlights of this release include: data parallelism for
operations on ranges, local domains, and local arrays; increased
stability and completeness of the Block distribution; improved memory
utilization in the generated executable; support for PrgEnv-cray as a
back-end C compiler; improvements to the PBS launcher; general
performance and correctness improvements; and the inclusion of editor
modes for emacs and vim. For a full list of changes in the release,
please refer to $CHPL_HOME/CHANGES after loading the chapel module.

This release of Chapel contains stable support for the base language
and for task parallelism across multiple nodes. It supports Chapel's
data parallel features for regular ranges, domains, and arrays on
single nodes and on multiple nodes using a Block distribution. Data
parallel features on irregular domains and arrays are supported in a
single-threaded, single-node reference implementation. Preliminary
support for multi-node cyclic and block-cyclic distributions is also
provided.
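
As a rough illustration of the single-node data parallel support
described above, here is a minimal sketch of a forall loop over a local
domain and array. It is not taken from the release itself. The names n,
D, and A are hypothetical, and the syntax assumes the Chapel 1.0-era
language (for example, square-bracket domain literals), which differs
from later Chapel versions.

```chapel
// Minimal sketch, assuming Chapel 1.0-era syntax; all names are hypothetical.
config const n = 8;     // problem size; a config const can be set at execution time

const D = [1..n];       // a local arithmetic domain (1.0-era literal syntax)
var A: [D] real;        // a local array declared over D

// forall loops over arithmetic domains are candidates for parallel execution
forall i in D do
  A(i) = 2.0 * i;

writeln(A);
```

Because n is a configuration constant, its value can be overridden on
the command line of the generated executable rather than by editing the
source.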

While performance has received a great deal of attention in Chapel's
design, this release lacks a number of crucial performance
optimizations and is not suitable for in-depth performance comparisons
and studies.

The Chapel release is made available under the BSD license and the
user agreement attached below.

To get started with Chapel, download and unpack Chapel and then refer
to the top-level README file for 'Quick Start' instructions and
pointers to next steps such as example programs and the language
specification.

For more information on Chapel beyond what's contained in the release,
please refer to the Chapel webpage at http://chapel.cray.com and our
SourceForge project page at http://sourceforge.net/projects/chapel.

-------------------------------------------
Chapel Public Release User Agreement (v1.0)
-------------------------------------------

This release is being distributed under the BSD License (see LICENSE)
and the user agreement below.

By using this release of Chapel I acknowledge that I have read and
understand the following:

* Chapel is a work-in-progress and not a finished product. This
  release of Chapel is being made available to give the general public
  insight into that work and to encourage feedback that will improve
  it in the future. Feedback should be directed to
  chapel_info@cray.com.

* Although Chapel has been designed with performance in mind,
  performance optimizations have not been an implementation priority
  to date. This release of Chapel is not intended for performance
  studies or comparisons.

* Although the Chapel team will do what they can to support this
  release, no formal support is available at this time.

----------------------
Changes in version 1.0
----------------------

Third public release of Chapel compiler, October 15, 2009

High-Level Themes:
------------------
- multi-locale task parallelism
- improved support for single- and multi-locale data parallelism
- improved stability and portability
- improved memory utilization of compiler-generated code
- target audience: general public

Environment Changes:
--------------------
- added emacs/vim modes to release -- see $CHPL_HOME/etc/README

Changes to Chapel Language:
---------------------------
- removed implicit coercions from primitive types to strings to avoid confusion
- a default array variable can now be made to alias another via the => operator
- accesses to variable x in module M using 'M.x' must now follow a 'use M'

Newly Implemented Features:
---------------------------
- forall loops over ranges & arithmetic domains/arrays are now parallelized
- improved support for and correctness of record and class destructors
- array declaration+initialization syntax now results in parallel evaluation
  e.g., var A: [i in D] real = foo(i); will be evaluated in parallel
- added == and != for imag and complex types; added >, >=, <, <= for imag types

Portability of code base:
-------------------------
- improved support for the Cray compiler on XT systems (cray-xt-cray)
- reduced warnings for gcc > 4.3
- improved portability with respect to Intel icc/icpc v11.x
- removed outdated assumptions about Sun compiler environments
- removed the makechpl script for Mac users because of portability issues

Platform-specific notes:
------------------------
- added a PBS launcher for the Cray CX1 named pbs-gasnetrun_ibv

Launcher-specific notes:
------------------------
- several improvements to the pbs launcher (see README.launcher/README.xt-cle)
- environment variables are now propagated to the application
- a queue can be specified via the CHPL_LAUNCHER_QUEUE environment variable
- a wallclock limit can be specified via CHPL_LAUNCHER_WALLTIME
- the NCCS pbs launcher no longer uses the debug queue by default
- added support for CHPL_LAUNCHER_SUFFIX to launch a file other than ..._real

Semantic Changes:
-----------------
- changed distributions from having class/reference semantics to value semantics
- made module initialization occur at program startup rather than use statements
- only modules specified on the command-line are candidates for the main module
- added support for returning locally scoped arrays from variable functions
- changed interpretation of method definitions on scalar types
  e.g., 'def int.foo()' now defines foo() for default-sized ints, not all ints

Syntactic/Naming Changes:
-------------------------
- renamed MultiBlockDist.chpl to BlockDist.chpl
- removed the Block1D distribution since Block subsumed it
- added placeholder notation for creating new distribution values
  e.g., new Block(...) => distributionValue(new Block(...))
- renamed the pbs launcher for Cray XT to pbs-aprun since it wraps both packages

Compiler Changes:
-----------------
- improved support for slicing [strided] domains/arrays with [strided] slices
- improved flushing of writeln() statements to the file being written to
- removed support for goto from the compiler's front-end

Runtime Library Changes:
------------------------
- improved pthread setup, termination, and cleanup for non-erroneous exits
- refactored threading runtime to support code reuse for pthread-like threads
- added support for memory tracking for multi-locale executions

Documentation:
--------------
- improved the Types, Modules, and Ranges chapters of the language specification
- added mention of 'delete' to language specification
- improved the Label, Break, and Continue subsection of the language spec
- minor changes to other chapters of the language specification
- updated README.xt-cle and README.launcher to reflect new pbs features
- updated the various READMEs to reflect minor changes and wording

Example Codes:
--------------
- changed fft to use a Block distribution
- changed reference to MultiBlockDist module in block2D.chpl to BlockDist
- changed distributions to use the placeholder value type notation
- changed default value of tasksPerLocale in HPCC benchmarks to avoid reductions
- changed RA's constant array m2 into a constant scalar for performance reasons
- changed follower iterator in ra-randstream.chpl to accept tuple of ranges
- deleted classes in example programs to reclaim memory
- increased problem size for reductions.chpl to avoid bug w/ 5+ cores per locale

Standard Modules:
-----------------
- added printMemStat() to the standard Memory module; improved printMemTable()
- added start/stopVerboseMem[Here]() to the Memory module for tracing memory use
- improved reference counting of domains and arrays
- removed the (undocumented) Schedules module

Standard Distributions:
-----------------------
- merged Block1D and Block since the latter subsumed the former
- removed the default rank of 1 for the Block distribution
- added support for a multidimensional target array of locales to Block
- improved support for strided domains/arrays in the Block distribution
- added support for slicing to the Block distribution
- added support for member(), order(), and position() to the Block distribution
- added initial support for a Cyclic distribution
- added very preliminary support for a Block-Cyclic distribution
- improved the support for the CSR distribution to match the default sparse case
- unified leader/follower iterators to always work on tuples of ranges
- removed subBlocks() from the standard distribution interface

Compiler Flags:
---------------
- added support for a module path flag (-M) to search for modules via filenames
- added a flag to print the module search path (--print-search-dirs)
- added a flag to print module files being parsed (--print-module-files)
- added support for a -I flag to specify a search directory for C headers

Generated Code Flags:
---------------------
- added support for specifying configuration variables/constants without =
  e.g., you can now use './a.out --n 4' in addition to './a.out --n=4'
- improved flags for tracking memory utilization (see README.executing)
- improved error messages to indicate the argument number
- made compiler-generated generic type names deterministic
- improved robustness of --numLocales flag

Bug Fixes/New Semantic Checks (for old semantics):
--------------------------------------------------
- added an error for using => on non-array types
- added an error for using (...x) on non-tuple types
- added a semantic check against tuples sized by 0 or a negative value
- made labels on statements other than serial loops be errors
- made break and continue only applicable within serial loops
- improved error checking when assigning between ranges of different boundedness
- fixed a bug in which breaks in serial loops gave errors in parallel contexts
- fixed a bug in which tuple copies sometimes aliased the original tuple
- fixed a bug in which generic fields were incorrectly aliased in constructors
- fixed a bug in which we were accidentally supporting illegal parameter casts
- fixed a bug in which string parameter members broke the compile
- fixed a bug in which indices were inadvertently shared/non-local in promotions
- fixed a bug in which pbs launchers did not work with shell prompts ending in $
- fixed a bug in which the compiler attempted to clone external functions
- for Cray XT, fixed default setting of GASNET_MAX_SEGSIZE to specify size in KB
- fixed a race in the creation of private, replicated distribution classes
- fixed a bug in which tensor iteration resulted in internal errors
- removed a subtle race condition in program startup
- fixed a bug in which we called default constructors by name
- fixed deletion of list elements in List module
- added support for generating the implicit Program module in --html view

Error Message Improvements:
---------------------------
- fixed line numbers in errors involving dynamic dispatch and [con/de]structors

Compiler Analysis and Optimizations/Performance of Generated Code:
------------------------------------------------------------------
- vastly reduced amount of memory leaked due to compiler-allocated data
- improved performance of loops using Block distributions
- improved performance and reduced memory requirements for memory tracking

Cleanup of Generated Code:
--------------------------
- embedded information about compilation options in the generated code

Testing system:
---------------
- improved precedence of execution options specified via .execopts or EXECOPTS
- made parallel testing place -nl x flags at the end of the command line
- added support for PVM-based testing to the test script

Internal:
---------
- replaced uses of "[unsigned] long long" with [u]int64_t for sane portability
- some unification of reserved names, though more remains
- improved Chapel's launcher runtime interface to be more general
- added a mechanism for intercepting printf/fprintf calls if required
- Makefile refactorings working toward supporting parallel make
- refactored runtime/mem- directories to decrease duplicated code
- renamed runtime files to improve standardization, though more remains
- removed linked list pointers from memory tracking table
- reduced amount of runtime code linked into the launcher binary
- made the use of chpl_globals_registry more consistent across locales
- relaxed compiler assumptions about classes with the "data class" pragma
- added support for generating type and offset information for communications
- some initial work toward supporting execution on heterogeneous architectures
- some initial work toward supporting CPU<->GPU computations in Chapel
- some initial work toward supporting profiling tools with Chapel
- removed "valid var" pragma
- made wrapper functions use blank intent rather than inheriting from wrappee
- changed strategy for determining when value types should be copied/destroyed
- made domain and array classes always have reindexed set to true by default
- added a developer flag for disabling compiler-introduced memory frees
- removed support for _init functions from the compiler
- removed assumptions that replicated global constants are stored symmetrically
- added thread cancel and join functions to the threading runtime interface
- added a type, chpl_threadID_t to pass thread IDs between C and Chapel
- changed point at which variables are put on the heap
- made all built-in modules be filename based
- refactored directory structure of $CHPL_HOME/modules into standard/internal...
- added optimizations to remove unnecessary autocopy/autodestroy calls
- improved robustness of internal modules that use standard ones
- changed the scan implementation to generate an array rather than a list
- changed array assignments to use array iteration rather than random access
- made --no-cpp-lines the default for -g for developers
- improved handling, robustness of built-in configuration variables
- split chpl_comm_md.h into two files to permit platform- and comm- overrides