Memory

memory.param

Define low-level memory settings for compute devices.

Settings for memory layout for supercells and particle frame-lists, data exchanges in multi-device domain-decomposition and reserved fields for temporarily derived quantities are defined here.

namespace picongpu

Typedefs

using picongpu::SuperCellSize = typedef typename mCT::shrinkTo< mCT::Int< 8, 8, 4 >, simDim >::type

size of a superCell

volume of a superCell must be <= 1024
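The volume limit above can be checked at compile time. The following is a minimal sketch, not PIConGPU code: `superCellVolume` is a hypothetical helper, and the 2D extent of 8 x 8 assumes that `shrinkTo` simply drops the last component for `simDim = 2`.

```cpp
#include <cstdint>

// Hypothetical helper (not part of PIConGPU): number of cells in a supercell.
constexpr std::uint32_t superCellVolume(std::uint32_t x, std::uint32_t y, std::uint32_t z = 1u)
{
    return x * y * z;
}

// Default 3D extent from memory.param: 8 x 8 x 4 cells.
constexpr std::uint32_t volume3D = superCellVolume(8u, 8u, 4u);
static_assert(volume3D == 256u, "8 * 8 * 4 cells");
static_assert(volume3D <= 1024u, "volume of a superCell must be <= 1024");

// Assumed 2D extent after shrinkTo: 8 x 8 cells.
constexpr std::uint32_t volume2D = superCellVolume(8u, 8u);
static_assert(volume2D <= 1024u, "volume of a superCell must be <= 1024");
```

An extent such as 16 x 16 x 8 (2048 cells) would fail the second kind of assertion and therefore violates the documented limit.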

using picongpu::MappingDesc = typedef MappingDescription< simDim, SuperCellSize >

define mapper which is used for kernel call mappings

using picongpu::GuardSize = typedef typename mCT::shrinkTo< mCT::Int< 1, 1, 1 >, simDim >::type

define the size of the core, border and guard area

PIConGPU uses spatial domain decomposition to parallelize over multiple devices with a non-shared memory architecture. The global spatial domain is organized per device in three sections: the GUARD area contains copies of cells from neighboring devices (also known as “halo” or “ghost” cells). The BORDER area is the outermost layer of cells of a device, equal to what neighboring devices see as their GUARD area. The CORE area is the innermost area of a device. Together with the BORDER area it defines the “active” spatial domain on a device.

GuardSize is defined in units of SuperCellSize per dimension.
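Because GuardSize counts supercells, the guard width in cells is the per-dimension product of GuardSize and SuperCellSize. A minimal sketch of that arithmetic follows; `Extent` and `guardWidthInCells` are hypothetical names introduced only for illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 3D extent type (not a PIConGPU type).
using Extent = std::array<std::uint32_t, 3>;

// Guard width in cells = guard size in supercells * supercell size, per dimension.
constexpr Extent guardWidthInCells(Extent const& guardSuperCells, Extent const& superCellSize)
{
    return Extent{
        guardSuperCells[0] * superCellSize[0],
        guardSuperCells[1] * superCellSize[1],
        guardSuperCells[2] * superCellSize[2]};
}

// Defaults from memory.param: GuardSize (1, 1, 1), SuperCellSize (8, 8, 4).
constexpr Extent defaultGuard = guardWidthInCells(Extent{1u, 1u, 1u}, Extent{8u, 8u, 4u});
static_assert(defaultGuard[0] == 8u && defaultGuard[1] == 8u && defaultGuard[2] == 4u,
              "one supercell of guard cells per dimension");
```

With the defaults, each device therefore keeps a guard layer 8 cells deep in x and y and 4 cells deep in z.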

Variables

constexpr size_t picongpu::reservedGpuMemorySize = 350 * 1024 * 1024
constexpr uint32_t picongpu::fieldTmpNumSlots = 1

number of scalar fields that are reserved as temporary fields

constexpr bool picongpu::fieldTmpSupportGatherCommunication = true

whether FieldTmp can gather neighbor information

If true, the method asyncCommunicationGather() can be called to copy data from the border of a neighboring GPU into the local guard. This is also known as building up a “ghost” or “halo” region in domain decomposition. It is only necessary for specific algorithms that extend the basic PIC cycle, e.g. with a dependence on derived density or energy fields.

struct picongpu::DefaultExchangeMemCfg

bytes reserved for species exchange buffer

This is the default configuration for the species exchange buffer sizes. The default sizes can be changed per species by adding the alias exchangeMemCfg, with members similar to those of DefaultExchangeMemCfg, to the flag list of the species.

Public Static Attributes

constexpr uint32_t picongpu::DefaultExchangeMemCfg::BYTES_EXCHANGE_X = 1 * 1024 * 1024
constexpr uint32_t picongpu::DefaultExchangeMemCfg::BYTES_EXCHANGE_Y = 3 * 1024 * 1024
constexpr uint32_t picongpu::DefaultExchangeMemCfg::BYTES_EXCHANGE_Z = 1 * 1024 * 1024
constexpr uint32_t picongpu::DefaultExchangeMemCfg::BYTES_EDGES = 32 * 1024
constexpr uint32_t picongpu::DefaultExchangeMemCfg::BYTES_CORNER = 8 * 1024
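To illustrate what a per-species configuration "with similar members" might look like, the sketch below copies the default values listed above into a stand-alone struct and defines a second, hypothetical struct `LargeExchangeMemCfg` with doubled sizes. The helper `neighborBufferBytes3D` is also hypothetical: it sums one buffer per neighbor direction in 3D (6 faces, 12 edges, 8 corners), which is an assumption for a rough estimate. The memory PIConGPU actually allocates may differ, for example if separate send and receive buffers are kept.

```cpp
#include <cstdint>

// Stand-alone copy of the documented default values (for illustration only).
struct DefaultExchangeMemCfg
{
    static constexpr std::uint32_t BYTES_EXCHANGE_X = 1 * 1024 * 1024;
    static constexpr std::uint32_t BYTES_EXCHANGE_Y = 3 * 1024 * 1024;
    static constexpr std::uint32_t BYTES_EXCHANGE_Z = 1 * 1024 * 1024;
    static constexpr std::uint32_t BYTES_EDGES = 32 * 1024;
    static constexpr std::uint32_t BYTES_CORNER = 8 * 1024;
};

// Hypothetical configuration for a species that exchanges many particles.
struct LargeExchangeMemCfg
{
    static constexpr std::uint32_t BYTES_EXCHANGE_X = 2 * 1024 * 1024;
    static constexpr std::uint32_t BYTES_EXCHANGE_Y = 6 * 1024 * 1024;
    static constexpr std::uint32_t BYTES_EXCHANGE_Z = 2 * 1024 * 1024;
    static constexpr std::uint32_t BYTES_EDGES = 64 * 1024;
    static constexpr std::uint32_t BYTES_CORNER = 16 * 1024;
};

// Hypothetical estimate: one buffer per neighbor direction in 3D
// (2 faces per axis, 12 edges, 8 corners). Not a PIConGPU function.
template<typename T_Cfg>
constexpr std::uint64_t neighborBufferBytes3D()
{
    return 2u * (std::uint64_t{T_Cfg::BYTES_EXCHANGE_X} + T_Cfg::BYTES_EXCHANGE_Y + T_Cfg::BYTES_EXCHANGE_Z)
        + 12u * std::uint64_t{T_Cfg::BYTES_EDGES}
        + 8u * std::uint64_t{T_Cfg::BYTES_CORNER};
}

// Doubling every member doubles the estimate.
static_assert(neighborBufferBytes3D<LargeExchangeMemCfg>() == 2u * neighborBufferBytes3D<DefaultExchangeMemCfg>(),
              "all members were doubled");
```

The larger default in y presumably reflects setups in which most particles cross device boundaries along that axis; a species with little transverse motion could use smaller x and z buffers to save device memory.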

precision.param

Define the precision of typically used floating point types in the simulation.

PIConGPU normalizes input automatically, which allows the core algorithms to use single precision by default. Note that implementations of various algorithms (usually plugins or non-core components) might still hard-code a different (mixed) precision for some critical operations.

mallocMC.param

Fine-tuning of the particle heap for GPUs: When running on GPUs, we use a high-performance parallel “new” allocator (mallocMC) which can be parametrized here.

namespace picongpu

Typedefs

using picongpu::DeviceHeap = typedef mallocMC::Allocator< mallocMC::CreationPolicies::Scatter< DeviceHeapConfig >, mallocMC::DistributionPolicies::Noop, mallocMC::OOMPolicies::ReturnNull, mallocMC::ReservePoolPolicies::SimpleCudaMalloc, mallocMC::AlignmentPolicies::Shrink<> >

Define a new allocator.

This is an allocator resembling the behaviour of the ScatterAlloc algorithm.

struct picongpu::DeviceHeapConfig

configure the CreationPolicy “Scatter”

Public Types

using picongpu::DeviceHeapConfig::pagesize = boost::mpl::int_<2 * 1024 * 1024>

a 2 MiB page can hold around 256 particle frames

using picongpu::DeviceHeapConfig::accessblocks = boost::mpl::int_<4>

accessblocks, regionsize and wastefactor are not conclusively investigated and might be performance sensitive for multiple particle species with heavily varying attributes (frame sizes)

using picongpu::DeviceHeapConfig::regionsize = boost::mpl::int_<8>
using picongpu::DeviceHeapConfig::wastefactor = boost::mpl::int_<2>
using picongpu::DeviceHeapConfig::resetfreedpages = boost::mpl::bool_<true>

resetfreedpages is used to minimize memory fragmentation with varying frame sizes
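The pagesize note implies a typical frame size: a 2 MiB page that holds around 256 frames leaves roughly 8 KiB per frame. The sketch below works through that arithmetic; `framesPerPage` is a hypothetical helper, and real frame sizes depend on the attributes of each species.

```cpp
#include <cstddef>

// Page size configured in DeviceHeapConfig: 2 MiB.
constexpr std::size_t pageSizeBytes = 2u * 1024u * 1024u;

// Hypothetical helper: how many frames of a given size fit into one page.
constexpr std::size_t framesPerPage(std::size_t pageBytes, std::size_t frameBytes)
{
    return pageBytes / frameBytes;
}

// "Around 256 frames per page" corresponds to frames of about 8 KiB.
constexpr std::size_t approxFrameBytes = pageSizeBytes / 256u;
static_assert(approxFrameBytes == 8192u, "2 MiB / 256 = 8 KiB");
static_assert(framesPerPage(pageSizeBytes, approxFrameBytes) == 256u, "round trip");
```

A species with more attributes has larger frames, so fewer of them fit into a page; frame sizes that differ strongly between species are the case in which the other parameters are noted above as possibly performance sensitive.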