Commit a22edae1 by Martin

Release 2022.1

0 parents
Showing with 4870 additions and 0 deletions


daiteq demo (RISC-V)
====================
Overview
--------
This project contains examples and simple programs for testing the functionality of
daiteq FPU and SWAR extensions in the RISC-V CPU (with BSP for NOEL-V systems).
A specific toolchain and compiler are required for building the examples.
Their availability can be checked with the following commands:
$ riscv64-daiteq-elf-as --version
$ clang --version
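The same check can be scripted, for instance before a batch build. The sketch below is only an illustration in the style of the auxiliary Python scripts. The helper name ``find_tools`` is hypothetical, and the check only tells whether the two commands named above are on the ``PATH``, not whether they are the daiteq builds.

```python
import shutil

# Tool names taken from the two commands shown above.
REQUIRED_TOOLS = ["riscv64-daiteq-elf-as", "clang"]


def find_tools(names):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(REQUIRED_TOOLS).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```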
Directory structure
-------------------
daiteq-demo
|- examples - complex examples which can be executed in the QEMU emulator or on the board
|- libs - libraries required for building the examples (BSP, libc, libm, ...)
|- scripts - auxiliary python scripts for building tests and examples
\- simple - set of simple programs which test individual compiler features for
RISC-V CPU with the daiteq extensions
Simple programs
---------------
Simple tests are placed in the directory *simple*. The directory contains a ``Makefile``
so all tests can be built directly with the ``make`` command.
$ cd simple
$ make
The building process creates the *BUILD* directory in the *simple* directory.
Each built test is placed in its own subdirectory of the *BUILD* directory.
The subdirectory contains all generated files: the output of each compilation step,
including the internal files generated during compilation, and a file with the error output.
If all the compilation steps have finished successfully, the subdirectory also contains
a file named ``done``.
The subdirectory contains the following files:
*.ast - output of the clang compiler frontend
*.bc - bitcode of the LLVM Intermediate Representation
*.ll - human readable LLVM Intermediate Representation
*.dis_ll - human readable LLVM Intermediate Representation disassembled from *.bc
*.S - target assembler code - output of the compiler
*.o - assembled target object file (from the *.S file using the target assembler)
*.dis - disassembled code from the output object file (*.o)
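Since a ``done`` file marks a test whose compilation steps all succeeded, the state of a build can be summarised by looking for that file in each subdirectory. This is a minimal sketch, assuming the layout described above (one subdirectory per test under *BUILD*). The function name ``build_status`` is hypothetical and not part of the project scripts.

```python
from pathlib import Path


def build_status(build_dir):
    """Return {test_name: bool}; True when the subdirectory has a 'done' file."""
    root = Path(build_dir)
    return {
        sub.name: (sub / "done").is_file()
        for sub in sorted(root.iterdir())
        if sub.is_dir()  # plain files directly under BUILD are ignored
    }


if __name__ == "__main__":
    build = Path("simple/BUILD")
    if build.is_dir():
        for test, ok in build_status(build).items():
            print(f"{'ok    ' if ok else 'FAILED'} {test}")
```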
Examples
--------
The directory *examples* contains complex programs which can be built for the target
board or for the QEMU emulator.
The directory contains a ``Makefile`` and a bash script for building programs which are
stored in individual subdirectories.
Before compiling any example, the libraries must be initialised once with the script
'./libs/init.sh'. The script clones the 'newlib' library from a remote git repository and patches it.
The examples can be compiled with the ``Makefile`` in two ways. The first is to use the bash
script together with a recipe file that lists build configurations. An example of such a file is
``test_list.txt``, which compiles all the examples for several configurations.
The following commands start compilation from the recipe:
$ cd examples
$ ./test.sh build ./test_list.txt
The recipe file contains lines with settings such as the linker script and
the compilation flags to use.
Each line describes one build of one example and contains three or more parameters.
The first parameter is the name of the example (the name of the directory that contains it).
The second parameter is a user identification number which distinguishes compiled programs;
it is useful if the same program is compiled many times with different build options.
The third parameter is the daiFPU or SWAR configuration for which the program is built.
All other parameters are used as compiler options.
A more detailed description of the recipe file syntax is included in the 'test.sh' script.
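To make the field order concrete, the sketch below splits one recipe line into the four parts just described. It assumes that fields are separated by whitespace and uses a made-up line. The authoritative syntax is the one documented in 'test.sh'; the names ``Recipe`` and ``parse_recipe_line`` and the configuration name ``cfgA`` are hypothetical.

```python
import shlex
from typing import List, NamedTuple


class Recipe(NamedTuple):
    test: str            # name of the example directory
    user_id: str         # user identification number
    config: str          # daiFPU or SWAR configuration
    options: List[str]   # everything else: compiler options


def parse_recipe_line(line):
    """Split one recipe line; raises ValueError if it has fewer than 3 fields."""
    fields = shlex.split(line)
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 parameters, got {len(fields)}")
    return Recipe(fields[0], fields[1], fields[2], fields[3:])


if __name__ == "__main__":
    # Hypothetical line; 'cfgA' is a placeholder, not a real configuration name.
    print(parse_recipe_line("fft 0001 cfgA -O2 -DPACKED"))
```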
The second way to compile a program is to run the make tool directly:
$ cd examples
$ make TEST=<program> ID=<user ID> FPUCFG=<fpu_configuration> LDSCRIPT=<linker_script> UCFLAGS='<all other compilation options>'
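When many builds are launched from a script, the direct invocation can be assembled programmatically. The sketch below only maps arguments onto the variables named in the command above (``TEST``, ``ID``, ``FPUCFG``, ``LDSCRIPT``, ``UCFLAGS``). The helper ``make_command`` and the configuration name ``cfgA`` are hypothetical.

```python
import shlex


def make_command(test, user_id, fpucfg, ldscript=None, ucflags=()):
    """Build the argv list for the direct 'make' invocation."""
    argv = ["make", f"TEST={test}", f"ID={user_id}", f"FPUCFG={fpucfg}"]
    if ldscript:
        argv.append(f"LDSCRIPT={ldscript}")
    if ucflags:
        # All remaining compiler options travel in the single UCFLAGS variable.
        argv.append("UCFLAGS=" + " ".join(ucflags))
    return argv


if __name__ == "__main__":
    # 'cfgA' is a placeholder configuration name.
    cmd = make_command("fft", "0001", "cfgA", ucflags=["-O2", "-DPACKED"])
    print(shlex.join(cmd))  # to be run from the 'examples' directory
```

Returning an argument list rather than a single string keeps the quoting of ``UCFLAGS`` out of the caller's hands; ``shlex.join`` is used only for display.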
Both ways create a directory *BUILD* that contains subdirectories named according to the selected daiFPU or SWAR configurations.
Each of these subdirectories contains the compiled examples stored as ELF and DAT files.
The compiled output is disassembled, and three log files are produced that contain a summary of hardware FPU instructions generated
and soft-float functions called.
There are also other subdirectories that contain object files compiled from the used libraries.
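Where the outputs of one build land can be worked out from the path variables of the examples' makefile (``DSTDIR``, ``PRGBIN``, ``PRGDIS``, ``PRGINF``, ``PRGINSPECT``). The sketch below mirrors those definitions. ``ARCHID`` is obtained from the compiler at build time, so the value used here is a placeholder, and ``output_paths`` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def output_paths(archid, user_id, test, binname, bdir="BUILD", dsuffix=""):
    """Mirror DSTDIR, PRGBIN, PRGDIS, PRGINF and PRGINSPECT from the makefile."""
    dstdir = PurePosixPath(bdir) / f"{archid}{dsuffix}"   # $(BDIR)/$(ARCHID)$(DSUFFIX)
    workdir = dstdir / f"{user_id}-{test}"                # $(DSTDIR)/$(ID)-$(TEST)
    return {
        "elf": dstdir / f"{user_id}-{binname}.elf",
        "dis": workdir / f"{binname}.dis",
        "inf": workdir / f"{binname}.inf",
        "inspect": workdir / f"{binname}.inspect",
    }


if __name__ == "__main__":
    # 'ARCHID_PLACEHOLDER' stands in for the identifier reported by the compiler.
    paths = output_paths("ARCHID_PLACEHOLDER", "0001", "fft", "fft")
    for kind, path in paths.items():
        print(f"{kind:8s} {path}")
```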
There are two files with recipes for all complex programs. The first file (test_list.txt)
contains complex programs that test and demonstrate the use of the daiFPU floating-point unit.
The second file (test_list_swar.txt) contains examples for testing the daiteq SWAR extension.
All examples except the linpack program are compiled for 128KB or 256KB
RAM, which is the standard memory size configured for the Virtex5 target in the LEON2FT version with the daiteq extensions.
The linpack program does not fit in 256KB RAM, so it is compiled for 4MB RAM.
ifeq ($(BDIR),)
BDIR=BUILD
endif
# default ID, so that 'clean' does not remove all files
ID?=0000
F2RM=$(shell find ./$(BDIR) -maxdepth 2 -name "$(ID)*")
clean:
@for i in $(F2RM); do echo remove $$i; rm -rf $$i; done
distclean:
rm -rf $(BDIR)/*
# Main makefile for building test with llvm/binutils toolchain
TOP:=$(shell pwd)
TOOL=llvm
TOOLCHAIN=riscv64-daiteq-elf
CC=clang
LC=llc
LD=$(TOOLCHAIN)-ld
AS=$(TOOLCHAIN)-as
AR=$(TOOLCHAIN)-ar
OBJDUMP=$(TOOLCHAIN)-objdump
RANLIB=llvm-ranlib
INSPECT=$(TOP)/../scripts/inspectfp_rv.py
# default linker script
DEFLNKSCR=scripts/qemu_noelv.lds
# source codes of libraries
LIBDIR=$(abspath $(TOP)/../libs)
# added objects (from BSP/LIBBCC library)
ADDOBJS=bcc/first.o bcc/crt0.o
LIBFLAGS=-ggdb
# include project settings from a specific test
include $(TEST)/Makefile
$(info * Process in Makefile.llvm)
#$(info . TOP=$(TOP))
#$(info . BDIR=$(BDIR))
#$(info . DSUFFIX=$(DSUFFIX))
#$(info . TEST=$(TEST))
#$(info . LDS=$(LDS))
#$(info . ID=$(ID))
$(info . FPUCFG=$(FPUCFG))
$(info . SWARCFG=$(SWARCFG))
#$(info . MAKEVAR=$(MAKEVAR))
#$(info . UCFLAGS=$(UCFLAGS))
#$(info . ARCH=$(ARCH))
#$(info . ABI=$(ABI))
#$(info . ARCHCFLAGS=$(ARCHCFLAGS))
#$(info . * BIN=$(BIN))
#$(info . SOURCES=$(SOURCES))
#$(info . HEADERS=$(HEADERS))
#$(info . LIBS=$(LIBS))
ADD_ARCH_CCFLG=
ifneq ($(FPUCFG),)
ADD_ARCH_CCFLG+=-daiteq-fpu-type=$(FPUCFG)
endif
ifneq ($(SWARCFG),)
# ADD_ARCH_CCFLG+=-daiteq-fpu-type=$(FPUCFG)
endif
ARCH_CC_FLAGS:=-march=$(ARCH) -mabi=$(ABI) $(ADD_ARCH_CCFLG) $(ARCHCFLAGS)
#:%=-m%)
# the assembler formerly did not take the special extensions specified in ARCH (from version 220706 the additional extensions are used)
#ARCH_AS_FLAGS:=-march=$(word 1,$(subst _, ,$(ARCH))) -mabi=$(ABI)
ARCH_AS_FLAGS:=-march=$(ARCH) -mabi=$(ABI)
$(info : ARCH_CC_FLAGS=$(ARCH_CC_FLAGS))
ifeq (n,$(findstring n,$(firstword -$(MAKEFLAGS))))
$(info RUN SHELL CMD: touch /tmp/dummy.c; $(CC) -print-sf-uid $(ARCH_CC_FLAGS) -c /tmp/dummy.c -o /dev/null 2>/dev/null)
endif
ARCHID:=$(shell touch /tmp/dummy.c; $(CC) -print-sf-uid $(ARCH_CC_FLAGS) -c /tmp/dummy.c -o /dev/null 2>/dev/null)
$(info : ARCHID=$(ARCHID))
ifeq ($(ARCHID),)
$(error ARCHID cannot be empty)
endif
LIBDSTDIR=$(BDIR)
DSTDIR=$(BDIR)/$(ARCHID)$(DSUFFIX)
PRGBIN=$(DSTDIR)/$(ID)-$(BIN).elf
PRGDIS=$(DSTDIR)/$(ID)-$(TEST)/$(BIN).dis
PRGINF=$(DSTDIR)/$(ID)-$(TEST)/$(BIN).inf
PRGINSPECT=$(DSTDIR)/$(ID)-$(TEST)/$(BIN).inspect
ifeq ($(LDS),)
LNKSCR=-T $(TOP)/$(DEFLNKSCR)
else
LNKSCR=-T $(TOP)/$(LDS)
endif
OBJS=$(SOURCES:%.c=$(DSTDIR)/$(ID)-$(TEST)/%.o)
LIBSAID:=$(foreach lib,$(LIBS),$(LIBDSTDIR)/lib$(lib)/$(ARCHID)/lib$(lib).a)
LIBDIRS:=$(foreach lib,$(sort $(LIBS)),$(LIBDSTDIR)/lib$(lib)/$(ARCHID))
AOBJDST=$(foreach ao,$(ADDOBJS),$(LIBDSTDIR)/lib$(dir $(ao))$(ARCHID)/$(notdir $(ao)))
# we need building directory, program directories and program binaries
all: $(DSTDIR)/$(ID)-$(TEST) $(PRGBIN) $(PRGDIS) $(PRGINF) $(PRGINSPECT)
@echo "done..."
$(DSTDIR)/$(ID)-$(TEST):
@mkdir -p $(DSTDIR)/$(ID)-$(TEST)
$(PRGINSPECT): $(PRGDIS) $(INSPECT)
@$(INSPECT) $< list > $@
$(PRGINF):
@echo "Compilation settings (llvm)" > $@
@echo ". TOP=$(TOP)" >> $@
@echo ". BDIR=$(BDIR)" >> $@
@echo ". DSUFFIX=$(DSUFFIX)" >> $@
@echo ". TEST=$(TEST)" >> $@
@echo ". LDS=$(LDS)" >> $@
@echo ". ID=$(ID)" >> $@
@echo ". FPUCFG=$(FPUCFG)" >> $@
@echo ". SWARCFG=$(SWARCFG)" >> $@
@echo ". MAKEVAR=$(MAKEVAR)" >> $@
@echo ". UCFLAGS=$(UCFLAGS)" >> $@
@echo ". ARCH=$(ARCH)" >> $@
@echo ". ABI=$(ABI)" >> $@
@echo ". ARCHCFLAGS=$(ARCHCFLAGS)" >> $@
@echo ". * BIN=$(BIN)" >> $@
@echo ". SOURCES=$(SOURCES)" >> $@
@echo ". HEADERS=$(HEADERS)" >> $@
@echo ". LIBS=$(LIBS)" >> $@
@echo "-> ARCHID=$(ARCHID)" >> $@
$(PRGDIS): $(PRGBIN)
@$(OBJDUMP) -S $< > $@
$(PRGBIN): $(LIBSAID) $(OBJS)
@echo "Link application $@"
@$(LD) --gc-sections $(LNKSCR) $(LIBDIRS:%=-L%) -o $@ $(AOBJDST) $(OBJS) $(LIBS:%=-l%)
# rules for building libraries
define GEN_LIB_RULE
$(LIBDSTDIR)/lib$(lib)/$(ARCHID)/lib$(lib).a: $(LIBDIR)/Makefile.lib$(lib)
@mkdir -p $(LIBDSTDIR)/lib$(lib)/$(ARCHID)
@make -C $(LIBDIR) -f Makefile.lib$(lib) all LIB=lib$(lib) LIBFILE=$(abspath $(LIBDSTDIR)/lib$(lib)/$(ARCHID))/lib$(lib).a TOP=$(abspath $(TOP)) DSTDIR=$(abspath $(LIBDSTDIR)/lib$(lib)/$(ARCHID)) CCARCH="$(ARCH_CC_FLAGS)" ASARCH="$(ARCH_AS_FLAGS)" ARCHID=$(ARCHID) TOOL=$(TOOL) TOOLCHAIN=$(TOOLCHAIN) LIBFLAGS=$(LIBFLAGS) CC=$(CC) LC=$(LC) LD=$(LD) AS=$(AS) AR=$(AR) RANLIB=$(RANLIB)
endef
$(foreach lib,$(sort $(LIBS)), \
$(eval $(GEN_LIB_RULE)) \
)
# rules for compiling object files
define GEN_OBJ_RULE
$(DSTDIR)/$(ID)-$(TEST)/$(src).o: $(DSTDIR)/$(ID)-$(TEST)/$(src).c_S
@echo "Assemble $$@ from $$<"
@$(AS) $(ARCH_AS_FLAGS) -gdwarf-5 -o $$@ $$<
$(DSTDIR)/$(ID)-$(TEST)/$(src).c_S: $(TOP)/$(TEST)/$(src).c $(HEADERS:%=$(TOP)/$(TEST)/%)
@echo "Compile $$@ from $$<"
@$(CC) $(ARCH_CC_FLAGS) $(CFLAGS) $(addprefix -D,$(DEFS)) $(UCFLAGS) $(addsuffix /include,$(addprefix -I$(LIBDSTDIR)/lib,$(sort $(LIBS)))) -fno-addrsig -S -o $$@ $$<
endef
#$(DSTDIR)/$(ID)-$(TEST)/$(src).c_S: $(DSTDIR)/$(ID)-$(TEST)/$(src).bc
# @echo "Compile $$@ from $$<"
# @$(LC) --target-abi=$(ABI) --filetype=asm -o $$@ $$<
#$(DSTDIR)/$(ID)-$(TEST)/$(src).bc: $(TOP)/$(TEST)/$(src).c $(HEADERS:%=$(TOP)/$(TEST)/%)
# @echo "Compile $$@ from $$<"
# @$(CC) -emit-llvm $(ARCH_CC_FLAGS) $(CFLAGS) $(addprefix -D,$(DEFS)) $(UCFLAGS) $(addsuffix /include,$(addprefix -I$(LIBDSTDIR)/lib,$(sort $(LIBS)))) -c -o $$@ $$<
$(foreach src,$(SOURCES:%.c=%), \
$(eval $(GEN_OBJ_RULE)) \
)
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org>
# Makefile for fft project
BIN=fft
TABLE_SIZE?=10
BLOCKSIZES?=1,2,4,8,16,30
# C sources
SOURCES=fft-run.c
HEADERS=tables/fft_table2_$(TABLE_SIZE)_data.h ../common_sys_header.inc
CFLAGS=-DNOHWCONF
# libraries in required order (compiled in the order)
LIBS=c m c bcc built
#tables/fft_table2_$(TABLE_SIZE)_data.h: gen_table_2.sh
# mkdir -p tables
# echo FFT_TABLE: $@ $<
# $(shell -c "env TABLE_SIZE=$(TABLE_SIZE) ./$<" >$@.tmp || rm -f $@.tmp)
# mv -f $@.tmp $@
This directory contains a copy of the FFT benchmark.
The program executes the FFT kernel across different FFT sizes, and
using different numbers of threads each time.
The performance measurements are reported on the UART output.
The program prints "Done!" at the end if the execution completed successfully.
The maximum FFT size that is tested is set by the `make`
variable TABLE_SIZE. By default this is set to 10, which means
the max FFT size is 1024 (2^10). Change with e.g.:
make TABLE_SIZE=8
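The arithmetic behind `TABLE_SIZE` can be sketched as follows. The benchmark loop runs one FFT for every power of two up to `2^TABLE_SIZE`, and the table generator script emits one `{cos, -sin}` pair per entry for a table of `2^TABLE_SIZE` entries. The function names `fft_sizes` and `twiddle_table` are hypothetical; the snippet is an illustration, not the project's generator.

```python
import math


def fft_sizes(table_size=10):
    """FFT sizes exercised by the benchmark: N = 2**M for M = 1..table_size."""
    return [1 << m for m in range(1, table_size + 1)]


def twiddle_table(table_size=10):
    """(cos, -sin) of 2*pi*i/n for i = 0..n-1, with n = 2**table_size."""
    n = 1 << table_size
    return [
        (math.cos(2 * math.pi * i / n), -math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


if __name__ == "__main__":
    print(fft_sizes(8))            # sizes tested after 'make TABLE_SIZE=8'
    print(len(twiddle_table(10)))  # number of table entries for the default
```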
The other parameter is the number of threads used in each test. For each
FFT size the program tries every block size listed in the variable
BLOCKSIZES. By default this is set to 1,2,4,8,16,30.
Change with e.g.:
make BLOCKSIZES=1,2,4,8
The command used to compile is defined by the `make` variable SLC.
This is set by default to `slc -b l2mt_f`.
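The FFT kernel is built around a complex multiply. In the packed build, the comments in `fft.h` describe it as three packed steps: `FMULP` gives `(ar*br, ai*bi)`, `FMULXP` gives `(ar*bi, ai*br)`, and `FSUBADDRP` combines the two into `(ar*br-ai*bi, ar*bi+ai*br)`. The sketch below emulates those steps on plain tuples to show that they add up to an ordinary complex product. The lane semantics of `fsubaddrp` are inferred from those comments, not from an instruction-set reference, and all function names are hypothetical.

```python
def fmulp(a, b):
    """Element-wise product: (ar*br, ai*bi)."""
    return (a[0] * b[0], a[1] * b[1])


def fmulxp(a, b):
    """Cross product: (ar*bi, ai*br)."""
    return (a[0] * b[1], a[1] * b[0])


def fsubaddrp(c, d):
    """Assumed semantics: (c0 - c1, d0 + d1), one reduction per operand."""
    return (c[0] - c[1], d[0] + d[1])


def cmul(a, b):
    """Complex multiply composed from the three packed steps."""
    return fsubaddrp(fmulp(a, b), fmulxp(a, b))


if __name__ == "__main__":
    a, b = (3.0, 4.0), (0.0, 1.0)
    print(cmul(a, b))                 # three-step result as (re, im)
    print(complex(*a) * complex(*b))  # reference from Python's complex type
```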
//
// fft-run.c: this file is part of the SL program suite.
//
// Copyright (C) 2020-2022 daiteq.
//
//
#include <stdint.h>
#include <stdio.h>
#if !defined(__riscv) && (USEDBSP==BSP_RTEMS)
#define USE_RTEMS 1
#endif
#if USE_RTEMS
#include <stdio.h> /* standard I/O */
#include <stdlib.h> /* for exit - 1 occurrence */
#include <math.h>
#include <rtems.h>
#include <sys/timespec.h>
#include <sys/unistd.h>
#else
#include "../common_sys_header.inc"
#endif
#define counter_t long unsigned int
//#define putchar m_putchar
// extern FILE *stduart;
// extern FILE *dbgstdout;
void report_perf(int sz, int b, const char *pre, counter_t t, counter_t i) {
printf("%s %d %d %lu %lu\n", pre, b, sz, t, i);
}
#ifndef TABLE_SIZE
#define TABLE_SIZE 10
#endif
#include "fft.h"
#include "fft_impl4.c"
__attribute__((always_inline))
unsigned delta(counter_t end, counter_t start) {
return (start <= end) ? (end - start) : ((0xffffffffu-start) + end + 1); /* 32-bit counter wrapped: rest of the range after start, plus end */
}
#ifndef BLOCKSIZES
#define BLOCKSIZES 1,2,4,8,16,30
#endif
static const int blocksizes[] = { BLOCKSIZES };
#define NBLSZ (sizeof(blocksizes)/sizeof(blocksizes[0]))
#if USE_RTEMS
#define dtime clock
#else
unsigned dtime(void)
{
return get_usec();
}
#endif
#define Y_FFT_ARRAY_SIZE ((1<<TABLE_SIZE)) // * sizeof(cpx_t))
cpx_t y_fft[Y_FFT_ARRAY_SIZE];
#if USE_RTEMS
rtems_task Init(
rtems_task_argument ignored
)
#else
int main(void)
#endif
{
#ifndef LSZ
#define LSZ 1
#endif
counter_t c1, c2;
counter_t i1, i2;
// cpx_t *y_fft = (cpx_t*)calloc((1<<TABLE_SIZE), sizeof(cpx_t));
// dbgstdout = stduart;
system_init();
#if defined (PACKED)
printf("FFT - SINGLE PRECISION, PACKED\n");
#else
printf("FFT - SINGLE PRECISION, NORMAL\n");
#endif
unsigned i;
printf("Columns:\nLN2(SZ)\tINSNS");
for (i = 0; i < NBLSZ; i++)
printf("\tCC_BLK%d", blocksizes[i]);
printf("\n");
//#ifdef NOIOTIME
sysregs_init();
//#endif
unsigned M;
for (M = 1; M <= TABLE_SIZE; M++) {
unsigned N = 1 << M;
printf("%u ", N);
for (i = 0; i < NBLSZ; i++) {
unsigned blocksize = blocksizes[i];
//#ifdef NOIOTIME
//sysregs_start();
//#endif
c1 = dtime(); //mtperf_sample1(MTPERF_CLOCKS);
// i1 = mtperf_sample1(MTPERF_EXECUTED_INSNS);
FFT_1(M, y_fft, N/2, sc_table_ptr, blocksize);
c2 = dtime(); //mtperf_sample1(MTPERF_CLOCKS);
// i2 = mtperf_sample1(MTPERF_EXECUTED_INSNS);
//#ifdef NOIOTIME
//sysregs_stop();
//#endif
if (i == 0) {
// printf("\t%u", delta(i2, i1));
printf("\t");
}
printf("\t%u", delta(c2, c1));
}
printf("\n");
}
printf("Done!\n");
//#ifndef NOHWCONF
do {
unsigned ticks_lo, ticks_hi, insns_lo, insns_hi, fpop_lo, fpop_hi, fpld_lo, fpld_hi, fpst_lo, fpst_hi;
sysregs_stop();
sysregs_read_ext(&ticks_lo, &ticks_hi, &insns_lo, &insns_hi, &fpop_lo, &fpop_hi, &fpld_lo, &fpld_hi, &fpst_lo, &fpst_hi);
if (ticks_hi)
printf("TEST EXECUTED IN (%u * 2^32 + %u) TICKS AND (%u * 2^32 + %u) INSTRUCTIONS, FPU: FPOP (%u * 2^32 + %u) FPLD (%u * 2^32 + %u) FPST (%u * 2^32 + %u) \n\n", ticks_hi, ticks_lo, insns_hi, insns_lo, fpop_hi, fpop_lo, fpld_hi, fpld_lo, fpst_hi, fpst_lo);
else
printf("TEST EXECUTED IN %u TICKS AND %u INSTRUCTIONS, FPU: FPOP %u FPLD %u FPST %u \n\n", ticks_lo, insns_lo, fpop_lo, fpld_lo, fpst_lo);
} while(0);
//#endif /* NOHWCONF */
printf("\n\nEND OF TEST\n\n");
system_done();
return 0;
}
#if USE_RTEMS
/* configuration information */
#include <bsp.h>
/* NOTICE: the clock driver is enabled (dtime is mapped to clock()) */
#define CONFIGURE_APPLICATION_NEEDS_CLOCK_DRIVER
#define CONFIGURE_APPLICATION_NEEDS_CONSOLE_DRIVER
#if 0
#define CONFIGURE_USE_DEVFS_AS_BASE_FILESYSTEM
#endif
#define CONFIGURE_RTEMS_INIT_TASKS_TABLE
#define CONFIGURE_MAXIMUM_TASKS 1
#define CONFIGURE_INIT
#define CONFIGURE_INIT_TASK_ATTRIBUTES RTEMS_FLOATING_POINT
#include <rtems/confdefs.h>
#endif
//
// fft.h: this file is part of the SL program suite.
//
// Copyright (C) 2009 The SL project.
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 3
// of the License, or (at your option) any later version.
//
// The complete GNU General Public Licence Notice can be found as the
// `COPYING' file in the root directory.
//
#ifndef SL_BENCHMARKS_FFT_FFT_H
# define SL_BENCHMARKS_FFT_FFT_H
// #include "benchmark.h"
// #define FT long
#define FT float
#if (defined(PACKED) && (!defined(GCC)))
typedef float cpx_t __attribute__((ext_vector_type(2)));
#else
typedef struct { FT x; FT y; } cpx_t;
#endif
// typedef union fp64 {
// /*
// struct {
// float up;
// float dn;
// } stf;
// */
// cpx_t ps;
// double d;
// float f[2];
// unsigned u[2];
// } fp64;
#if defined(PACKED)
#if defined(GCC)
#define opcode_FADDPS(A, B, RES) \
__asm__ __volatile__ ( \
"fadd.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FSUBPS(A, B, RES) \
__asm__ __volatile__ ( \
"fsub.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMULPS(A, B, RES) \
__asm__ __volatile__ ( \
"fmul.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#endif
#define opcode_FADDRPS(A, B, RES) \
__asm__ __volatile__ ( \
"faddr.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "f" (A), "f" (B) \
)
#define opcode_FSUBRPS(A, B, RES) \
__asm__ __volatile__ ( \
"fsubr.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "f" (A), "f" (B) \
)
#define opcode_FMULXPS(A, B, RES) \
__asm__ __volatile__ ( \
"fmulx.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "f" (A), "f" (B) \
)
#define opcode_FSUBADDRPS(A, B, RES) \
__asm__ __volatile__ ( \
"fsubaddr.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMULRPS(A, B, RES) \
__asm__ __volatile__ ( \
"fmulr.ps\t%0,%1,%2\n" \
: "=f" (RES) \
: "f" (A), "f" (B) \
)
#endif
#if defined(PACKED)
#if defined(GCC)
#define dcadd(a,b,z) \
opcode_FADDPS(*((double*)&a),*((double*)&b),*((double*)&z));
#define dcsub(a,b,z) \
opcode_FSUBPS(*((double*)&a),*((double*)&b),*((double*)&z));
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
#define dcmul(a,b,z) \
do { \
cpx_t tc,td; \
opcode_FMULPS(*((double*)&a),*((double*)&b),*((double*)&tc)); \
opcode_FMULXPS(*((double*)&a),*((double*)&b),*((double*)&td)); \
opcode_FSUBADDRPS(*((double*)&tc),*((double*)&td),*((double*)&z)); \
} while (0)
#define dcabss(a,z) \
do { \
cpx_t tc, td, th; \
th.x=FZERO; th.y=FZERO; \
opcode_FMULPS(*((double*)&a),*((double*)&a),*((double*)&tc)); \
opcode_FADDRPS(*((double*)&tc),*((double*)&th),*((double*)&td)); \
z=td.x; \
} while (0)
#else
#define dcadd(a,b,z) \
z=a+b;
#define dcsub(a,b,z) \
z=a-b;
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
#define dcmul(a,b,z) \
do { \
cpx_t tc,td; \
tc=a*b; \
opcode_FMULXPS(a,b,td); \
opcode_FSUBADDRPS(tc,td,z); \
} while (0)
#define dcabss(a,z) \
do { \
cpx_t tc, td, th; \
th.x=FZERO; th.y=FZERO; \
tc=a*a; \
opcode_FADDRPS(tc,th,td); \
z=td.x; \
} while (0)
#endif
#else
#define dcadd(a,b,z) \
z.x=a.x+b.x; \
z.y=a.y+b.y;
#define dcsub(a,b,z) \
z.x=a.x-b.x; \
z.y=a.y-b.y;
#define dcmul(a,b,z) \
do { \
cpx_t th; \
th.x=a.x*b.x-a.y*b.y; \
th.y=a.y*b.x+a.x*b.y; \
z=th; \
} while (0)
#define dcabss(a,z) \
z=a.x*a.x+a.y*a.y;
#endif
/* low-level FFT (for benchmarks) */
static
void FFT_1(unsigned long M, cpx_t*restrict, unsigned long, const void*, unsigned);
#define STRINGIFY_(N) # N
#define STRINGIFY(N) STRINGIFY_(N)
#define MAKENAME_(N, SZ) tables/fft_table ## N ## _ ## SZ ## _data.h
#define MAKENAME(N, SZ) MAKENAME_(N, SZ)
#endif // ! SL_BENCHMARKS_FFT_FFT_H
//
// fft_impl4.c: this file is part of the SL program suite.
//
// Copyright (C) 2009 The SL project.
// Copyright (C) 2020 daiteq.
//
//
enum { MAX_M = TABLE_SIZE, MAX_N = 1 << MAX_M };
static const cpx_t sc_table[ MAX_N ] = {
#define HEADERNAME MAKENAME(2, TABLE_SIZE)
#define HEADER STRINGIFY(HEADERNAME)
#include HEADER
};
const void* sc_table_ptr = sc_table;
void FFT_2(unsigned p, unsigned long LE2, const cpx_t* restrict cos_sin, cpx_t* restrict X, unsigned long Z)
{
unsigned i = p;
const unsigned long w = i & ((LE2) - 1);
const unsigned long j = (i - w) * 2 + w;
const unsigned long ip = j + (LE2);
cpx_t* restrict x = (X);
const cpx_t U = (cos_sin)[w * (Z)];
cpx_t T; // const cpx_t
dcmul(U,x[ip], T);
const cpx_t xj = x[j];
dcsub(xj, T, x[ip]);
dcadd(xj, T, x[j]);
}
void FFT_1_mt(unsigned p, cpx_t* restrict X, unsigned long N2, const void* t)
{
int i;
unsigned k = p+1;
const cpx_t*restrict cos_sin = (const cpx_t*restrict)(const void*)(t);
unsigned long Z = (MAX_N >> k);
unsigned long LE = (1 << k);
for (i=0;i<N2;i++)
FFT_2(i, LE/2, cos_sin, X, Z);
}
static
void FFT_1(unsigned long M, cpx_t*restrict X, unsigned long N2, const void* t, unsigned blocksize)
{
int i;
for (i=0;i<M;i++) {
// printf("M %lu,i %u\n",M,i);
FFT_1_mt(i, X, N2, t);
}
}
#!/bin/sh
bc -l <<EOF
pi = 4 * a(1)
max_m = $TABLE_SIZE
max_n = 2 ^ max_m
for(i = 0; i < max_n; i++)
{
print " {", c(2 * i * pi / max_n), ",", -s(2 * i * pi / max_n), "},\n"
}
EOF
# Makefile for hello project
BIN=hello
# C sources
SOURCES=hello.c
HEADERS=../common_sys_header.inc
# libraries in required order (compiled in the order)
LIBS=c m c bcc built
#include <stdio.h>
int main(void)
{
printf("* Hello World\n\n");
return 0;
}
# Makefile for linpack project
# one of DP/SP/HP should be defined
# one of ROLL/UNROLL should be defined
BIN=linpack
# C sources
SOURCES=linpack.c
# Header and additional files
HEADERS=../common_sys_header.inc
CFLAGS=-ggdb
# libraries in required order (compiled in the order)
LIBS=c m c bcc built
# Makefile for mandelph project
BIN=mandelph
# C sources
SOURCES=mandelph.c
HEADERS=../common_sys_header.inc
CFLAGS=-ggdb
# libraries in required order (compiled in the order)
LIBS=c m c bcc built
/* -----------------------------------------------------------------------------
* Copyright (C) 2019-2021 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : mandelph.c
* Authors : Martin Danek
* Description : Mandelbrot set with packed half-precision ops
* Release :
* Version : 1.5
* Date : 27.4.2021
* -----------------------------------------------------------------------------
*/
#include <stdint.h>
#include "../common_sys_header.inc"
#define NOIOTIME
#if defined(PACKED)
typedef half cpxh_t __attribute__((ext_vector_type(2)));
#else
typedef struct cpxh_t {
half x;
half y;
} cpxh_t;
#endif
// typedef union fp32 {
// /*
// struct {
// half up;
// half dn;
// } sth;
// */
// half h[2];
// cpxh_t ph;
// float f;
// unsigned u;
// } fp32;
#if defined(PACKED)
#define opcode_FADDRPH(A, B, RES) \
__asm__ __volatile__ ( \
"faddr.ph\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMULXPH(A, B, RES) \
__asm__ __volatile__ ( \
"fmulx.ph\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FSUBADDRPH(A, B, RES) \
__asm__ __volatile__ ( \
"fsubaddr.ph\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FSUBRPH(A, B, RES) \
__asm__ __volatile__ ( \
"fsubr.ph\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMOVHZU(A, RES) \
__asm__ __volatile__ ( \
"fmvhzu.ph\t%0, %1\n" \
: "=f" (RES) \
: "f" (A) \
)
#define opcode_FMULRPH(A, B, RES) \
__asm__ __volatile__ ( \
"fmulr.ph\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#endif
#if defined(PACKED)
#define dcadd(a,b,z) \
z=a+b;
#define dcsub(a,b,z) \
z=a-b;
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
#define dcmul(a,b,z) \
do { \
cpxh_t tc, td; \
tc=a*b; \
opcode_FMULXPH(a,b,td); \
opcode_FSUBADDRPH(tc,td,z); \
} while (0)
#define dcabss(a,z) \
do { \
cpxh_t th={0.h,0.h}; \
cpxh_t tc, td; \
tc=a*a; \
opcode_FADDRPH(tc,th,td); \
z=td.x; \
} while (0)
#else
#define dcadd(a,b,z) \
do { \
z.x=a.x+b.x; \
z.y=a.y+b.y; \
} while (0)
#define dcsub(a,b,z) \
do { \
z.x=a.x-b.x; \
z.y=a.y-b.y; \
} while (0)
#define dcmul(a,b,z) \
do { \
cpxh_t th; \
th.x=a.x*b.x-a.y*b.y; \
th.y=a.y*b.x+a.x*b.y; \
z=th; \
} while (0)
#define dcabss(a,z) \
do { \
z=a.x*a.x+a.y*a.y; \
} while (0)
#endif
#if defined(FCALL)
cpxh_t cadd(cpxh_t a, cpxh_t b) {
cpxh_t c;
#if defined(PACKED)
c=a+b;
#else
c.x=a.x+b.x;
c.y=a.y+b.y;
#endif
return c;
}
cpxh_t csub(cpxh_t a, cpxh_t b) {
cpxh_t c;
#if defined(PACKED)
c=a-b;
#else
c.x=a.x-b.x;
c.y=a.y-b.y;
#endif
return c;
}
cpxh_t cmul(cpxh_t a, cpxh_t b) {
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
// W/O FSUBADDRP
// 3. E = FADDRP(D,0) = (ar*bi+ai*br, 0)
// 4. F = FSUBRP(C,0) = (ar*br-ai*bi, 0)
// 5. G = FMOVHZU(E) = (0, ar*bi+ai*br)
// 6. H = FADDP(F,G) = (ar*br-ai*bi, ar*bi+ai*br)
cpxh_t x,y,c,d,e,f,g,h;
const cpxh_t zero={0.H,0.H};
#if defined(PACKED)
x=a;
y=b;
c=x*y;
opcode_FMULXPH(x,y,d);
opcode_FSUBADDRPH(c,d,h);
// W/O FSUBADDRP
// opcode_FADDRPH(d,zero,e);
// opcode_FSUBRPH(c,zero,f);
// opcode_FMOVHZU(e,g);
// h.ph=f.ph+g.ph;
#else
h.x=a.x*b.x-a.y*b.y;
h.y=a.y*b.x+a.x*b.y;
#endif
return h;
}
half cabss(cpxh_t a) {
cpxh_t x,c,d;
const cpxh_t zero={0.h,0.h};
#if defined(PACKED)
c=a*a;
opcode_FADDRPH(c,zero,d);
#else
d.x=a.x*a.x+a.y*a.y;
#endif
return d.x;
}
#endif
#define RES_X 80
#define RES_Y 80
#define MAX_ITER 100
#define MAX_COL 80
#define MAX_ROW 80
#define printpacked(A) \
c=A.x; \
d=A.y; \
printf(" (%f,%f)",c,d);
int main(void) {
half a,b,c,d;
cpxh_t pa,pb,pc,point,next;
int i, j;
unsigned k,l,m,hit;
system_init();
#if defined (PACKED)
printf("MANDEL - HALF PRECISION, PACKED\n");
#else
printf("MANDEL - HALF PRECISION, NORMAL\n");
#endif
// #define INIT_TEST
#if defined (INIT_TEST)
a = 3.0h;
b = 4.0h;
pa.x = a;
pa.y = b;
printf("A=(%f,%f)\n",pa.x,pa.y);
pb.x = 0.h;
pb.y = 1.h;
printf("B=(%f,%f)\n",pb.x,pb.y);
#if defined(FCALL)
pc=cadd(pa,pb);
#else
dcadd(pa,pb,pc);
#endif
printf("pa+pb "); printpacked(pc); NL;
#if defined(FCALL)
pc=csub(pa,pb);
#else
dcsub(pa,pb,pc);
#endif
printf("pa-pb "); printpacked(pc); NL;
#if defined(FCALL)
pc=cmul(pa,pb);
#else
dcmul(pa,pb,pc);
#endif
printf("pa*pb "); printpacked(pc); NL;
#endif
#ifdef NOIOTIME
sysregs_init();
#endif
for (j=-RES_Y;j<RES_Y;j++) {
for (i=-2*RES_X;i<RES_X;i++) {
point.x=((half)i)/(((half)RES_X)*1.h);point.y=((half)j)/(((half)RES_X)*1.h);
k=0;
next.x=0.h;
next.y=0.h;
#if defined(FCALL)
while ((k<MAX_ITER)&&(cabss(next)<64000.h)) {
#else
dcabss(next,a);
while ((k<MAX_ITER)&&(a<64000.h)) {
#endif
#if defined(FCALL)
next=cadd(cmul(next,next),point);
#else
dcmul(next,next,pa);
dcadd(pa,point,next);
dcabss(next,a);
#endif
k++;
}
if (k==MAX_ITER) {
#ifdef NOIOTIME
sysregs_stop();
#endif
printf("*");
#ifdef NOIOTIME
sysregs_start();
#endif
}
else {
if (k<10) {
#ifdef NOIOTIME
sysregs_stop();
#endif
printf("%c",'0'+k);
#ifdef NOIOTIME
sysregs_start();
#endif
}
else {
#ifdef NOIOTIME
sysregs_stop();
#endif
printf("%c",'A'+((k>35)?35:k)-10);
#ifdef NOIOTIME
sysregs_start();
#endif
}
}
}
NL;
}
NL;
do {
unsigned ticks_lo, ticks_hi, insns_lo, insns_hi, fpop_lo, fpop_hi, fpld_lo, fpld_hi, fpst_lo, fpst_hi;
sysregs_stop();
sysregs_read_ext(&ticks_lo, &ticks_hi, &insns_lo, &insns_hi, &fpop_lo, &fpop_hi, &fpld_lo, &fpld_hi, &fpst_lo, &fpst_hi);
if (ticks_hi)
printf("TEST EXECUTED IN (%u * 2^32 + %u) TICKS AND (%u * 2^32 + %u) INSTRUCTIONS, FPU: FPOP (%u * 2^32 + %u) FPLD (%u * 2^32 + %u) FPST (%u * 2^32 + %u) \n\n", ticks_hi, ticks_lo, insns_hi, insns_lo, fpop_hi, fpop_lo, fpld_hi, fpld_lo, fpst_hi, fpst_lo);
else
printf("TEST EXECUTED IN %u TICKS AND %u INSTRUCTIONS, FPU: FPOP %u FPLD %u FPST %u \n\n", ticks_lo, insns_lo, fpop_lo, fpld_lo, fpst_lo);
} while(0);
printf("\n\nEND OF TEST\n\n");
system_done();
return 0;
}
# Makefile for mandelps project
BIN=mandelps
# C sources
SOURCES=mandelps.c
HEADERS=../common_sys_header.inc
CFLAGS=-ggdb
# libraries in required order (compiled in the order)
LIBS=c m c bcc built
/* -----------------------------------------------------------------------------
* Copyright (C) 2019-2021 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : mandelps.c
* Authors : Martin Danek
* Description : Mandelbrot set with packed single-precision ops
* Release :
* Version : 1.5
* Date : 27.4.2021
* -----------------------------------------------------------------------------
*/
#include <stdint.h>
#include "../common_sys_header.inc"
#define NOIOTIME
#if (defined(PACKED) && (!defined(GCC)))
typedef float cpx_t __attribute__((ext_vector_type(2)));
#else
typedef struct cpx_t {
float x;
float y;
} cpx_t;
#endif
#if (!defined(GCC))
#define FZERO 0.F
#define FONE 1.F
#define FTHREE 3.F
#define FFOUR 4.F
#define FMAX 1.e+35F
#else
#define FZERO ((float)0.)
#define FONE ((float)1.)
#define FTHREE ((float)3.)
#define FFOUR ((float)4.)
#define FMAX ((float)1.e+35)
#endif
// typedef union fp64 {
// /*
// struct {
// float up;
// float dn;
// } stf;
// */
// cpx_t ps;
// double d;
// float f[2];
// unsigned u[2];
// } fp64;
#if defined(PACKED)
#if defined(GCC)
#define opcode_FADDPS(A, B, RES) \
__asm__ __volatile__ ( \
"fadd.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FSUBPS(A, B, RES) \
__asm__ __volatile__ ( \
"fsub.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMULPS(A, B, RES) \
__asm__ __volatile__ ( \
"fmul.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#endif
#define opcode_FADDRPS(A, B, RES) \
__asm__ __volatile__ ( \
"faddr.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FSUBRPS(A, B, RES) \
__asm__ __volatile__ ( \
"fsubr.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMULXPS(A, B, RES) \
__asm__ __volatile__ ( \
"fmulx.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FSUBADDRPS(A, B, RES) \
__asm__ __volatile__ ( \
"fsubaddr.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#define opcode_FMULRPS(A, B, RES) \
__asm__ __volatile__ ( \
"fmulr.ps\t%0, %1, %2\n" \
: "=f" (RES) \
: "If" (A), "If" (B) \
)
#endif
#if defined(PACKED)
#if defined(GCC)
#define dcadd(a,b,z) \
opcode_FADDPS(*((double*)&a),*((double*)&b),*((double*)&z));
#define dcsub(a,b,z) \
opcode_FSUBPS(*((double*)&a),*((double*)&b),*((double*)&z));
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
#define dcmul(a,b,z) \
do { \
cpx_t tc,td; \
opcode_FMULPS(*((double*)&a),*((double*)&b),*((double*)&tc)); \
opcode_FMULXPS(*((double*)&a),*((double*)&b),*((double*)&td)); \
opcode_FSUBADDRPS(*((double*)&tc),*((double*)&td),*((double*)&z)); \
} while (0)
#define dcabss(a,z) \
do { \
cpx_t tc, td, th; \
th.x=FZERO; th.y=FZERO; \
opcode_FMULPS(*((double*)&a),*((double*)&a),*((double*)&tc)); \
opcode_FADDRPS(*((double*)&tc),*((double*)&th),*((double*)&td)); \
z=td.x; \
} while (0)
#else
#define dcadd(a,b,z) \
z=a+b;
#define dcsub(a,b,z) \
z=a-b;
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
#define dcmul(a,b,z) \
do { \
cpx_t tc,td; \
tc=a*b; \
opcode_FMULXPS(a,b,td); \
opcode_FSUBADDRPS(tc,td,z); \
} while (0)
#define dcabss(a,z) \
do { \
cpx_t tc, td, th; \
th.x=FZERO; th.y=FZERO; \
tc=a*a; \
opcode_FADDRPS(tc,th,td); \
z=td.x; \
} while (0)
#endif
#else
#define dcadd(a,b,z) \
z.x=a.x+b.x; \
z.y=a.y+b.y;
#define dcsub(a,b,z) \
z.x=a.x-b.x; \
z.y=a.y-b.y;
#define dcmul(a,b,z) \
do { \
cpx_t th; \
th.x=a.x*b.x-a.y*b.y; \
th.y=a.y*b.x+a.x*b.y; \
z=th; \
} while (0)
#define dcabss(a,z) \
z=a.x*a.x+a.y*a.y;
#endif
#if defined(FCALL)
cpx_t cadd(cpx_t a, cpx_t b) {
cpx_t c;
#if defined(PACKED)
c=a+b;
#else
c.x=a.x+b.x;
c.y=a.y+b.y;
#endif
return c;
}
cpx_t csub(cpx_t a, cpx_t b) {
cpx_t c;
#if defined(PACKED)
c=a-b;
#else
c.x=a.x-b.x;
c.y=a.y-b.y;
#endif
return c;
}
cpx_t cmul(cpx_t a, cpx_t b) {
// 1. C = FMULP(A,B) = (ar*br, ai*bi)
// 2. D = FMULXP(A,B) = (ar*bi, ai*br)
// 3. H = FSUBADDRP(C,D) = (ar*br-ai*bi, ar*bi+ai*br)
//
// W/O FSUBADDRP:
// 3. E = FADDRP(D,0) = (ar*bi+ai*br, 0)
// 4. F = FSUBRP(C,0) = (ar*br-ai*bi, 0)
// 5. G = FMOVHZU(E) = (0, ar*bi+ai*br)
// 6. H = FADDP(F,G) = (ar*br-ai*bi, ar*bi+ai*br)
//
cpx_t x,y,c,d,e,f,g,h,zero;
#if defined(PACKED)
// zero.d=0.;
x=a;
y=b;
c=x*y;
opcode_FMULXPS(x,y,d);
opcode_FSUBADDRPS(c,d,h);
// W/O FSUBADDRP
// opcode_FADDRPS(d,zero,e);
// opcode_FSUBRPS(c,zero,f);
// g.x=0;
// g.y=e.x;
// h=f+g;
#else
h.x=a.x*b.x-a.y*b.y;
h.y=a.y*b.x+a.x*b.y;
#endif
return h;
}
float cabss(cpx_t a) {
cpx_t x,c,d;
const cpx_t zero={FZERO,FZERO};
#if defined(PACKED)
c=a*a;
opcode_FADDRPS(c,zero,d);
#else
d.x=a.x*a.x+a.y*a.y;
#endif
return d.x;
}
#endif
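/*
 * Illustrative sketch (not part of the original sources): the three packed
 * steps listed in the cmul() comments, emulated in plain scalar C so that the
 * data flow can be checked on any host. The type pair_t and the emu_* helpers
 * are hypothetical names, and the lane order (x = real, y = imaginary) is an
 * assumption taken from the comments above.
 */

```c
/* Hypothetical two-lane value standing in for one packed register. */
typedef struct { float x, y; } pair_t;

/* Step 1, lane-wise product: (a.x*b.x, a.y*b.y). */
static pair_t emu_fmulp(pair_t a, pair_t b) {
    pair_t r = { a.x * b.x, a.y * b.y };
    return r;
}

/* Step 2, crossed product: (a.x*b.y, a.y*b.x). */
static pair_t emu_fmulxp(pair_t a, pair_t b) {
    pair_t r = { a.x * b.y, a.y * b.x };
    return r;
}

/* Step 3, reduce each operand: (c.x - c.y, d.x + d.y). */
static pair_t emu_fsubaddrp(pair_t c, pair_t d) {
    pair_t r = { c.x - c.y, d.x + d.y };
    return r;
}

/* Complex product assembled from the three emulated steps. */
static pair_t emu_cmul(pair_t a, pair_t b) {
    pair_t c = emu_fmulp(a, b);
    pair_t d = emu_fmulxp(a, b);
    return emu_fsubaddrp(c, d);
}
```

/*
 * With a = (3,4) and b = (0,1), the same operands as the INIT_TEST block,
 * emu_cmul() yields (-4,3), which matches (3+4i)*i = -4+3i.
 */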
#define RES_X 80
#define RES_Y 80
#define MAX_ITER 100
#define MAX_COL 80
#define MAX_ROW 80
#define printpacked(A) \
c=A.x; \
d=A.y; \
printf(" (%f,%f)",c,d);
int main(void) {
float a,b,c,d;
cpx_t pa,pb,pc,point,next;
int i, j;
unsigned k,l,m,hit;
system_init();
#if defined (PACKED)
printf("MANDELREF - SINGLE PRECISION, PACKED\n");
#else
printf("MANDELREF - SINGLE PRECISION, NORMAL\n");
#endif
// #define INIT_TEST
#if defined (INIT_TEST)
a = FTHREE;
b = FFOUR;
pa.x = a;
pa.y = b;
printf("A=(%f,%f)\n",pa.x,pa.y);
pb.x = FZERO;
pb.y = FONE;
printf("B=(%f,%f)\n",pb.x,pb.y);
#if defined(FCALL)
pc=cadd(pa,pb);
#else
dcadd(pa,pb,pc);
#endif
printf("pa+pb (%f,%f)\n",pc.x,pc.y);
#if defined(FCALL)
pc=csub(pa,pb);
#else
dcsub(pa,pb,pc);
#endif
printf("pa-pb (%f,%f)\n",pc.x,pc.y);
#if defined(FCALL)
pc=cmul(pa,pb);
#else
dcmul(pa,pb,pc);
#endif
printf("paxpb (%f,%f)\n",pc.x,pc.y);
#endif
#ifdef NOIOTIME
sysregs_init();
#endif
for (j=-RES_Y;j<RES_Y;j++) {
for (i=-2*RES_X;i<RES_X;i++) {
point.x=((float)i)/(((float)RES_X));point.y=((float)j)/(((float)RES_Y));
// point.x=i/(RES_X*1.);point.y=j/(RES_X*1.);
k=0;
next.x=FZERO;
next.y=FZERO;
#if defined(FCALL)
while ((k<MAX_ITER)&&(cabss(next)<FMAX)) {
#else
dcabss(next,a);
while ((k<MAX_ITER)&&(a<FMAX)) {
#endif
#if defined(FCALL)
next=cadd(cmul(next,next),point);
#else
dcmul(next,next,pa);
dcadd(pa,point,next);
dcabss(next,a);
#endif
k++;
}
if (k==MAX_ITER) {
#ifdef NOIOTIME
sysregs_stop();
#endif
printf("*");
#ifdef NOIOTIME
sysregs_start();
#endif
}
else {
if (k<10) {
#ifdef NOIOTIME
sysregs_stop();
#endif
printf("%c",'0'+k);
#ifdef NOIOTIME
sysregs_start();
#endif
}
else {
#ifdef NOIOTIME
sysregs_stop();
#endif
printf("%c",'A'+((k>35)?35:k)-10);
#ifdef NOIOTIME
sysregs_start();
#endif
}
}
}
NL;
}
NL;
do {
unsigned ticks_lo, ticks_hi, insns_lo, insns_hi, fpop_lo, fpop_hi, fpld_lo, fpld_hi, fpst_lo, fpst_hi;
sysregs_stop();
sysregs_read_ext(&ticks_lo, &ticks_hi, &insns_lo, &insns_hi, &fpop_lo, &fpop_hi, &fpld_lo, &fpld_hi, &fpst_lo, &fpst_hi);
if (ticks_hi)
printf("TEST EXECUTED IN (%u * 2^32 + %u) TICKS AND (%u * 2^32 + %u) INSTRUCTIONS, FPU: FPOP (%u * 2^32 + %u) FPLD (%u * 2^32 + %u) FPST (%u * 2^32 + %u) \n\n", ticks_hi, ticks_lo, insns_hi, insns_lo, fpop_hi, fpop_lo, fpld_hi, fpld_lo, fpst_hi, fpst_lo);
else
printf("TEST EXECUTED IN %u TICKS AND %u INSTRUCTIONS, FPU: FPOP %u FPLD %u FPST %u \n\n", ticks_lo, insns_lo, fpop_lo, fpld_lo, fpst_lo);
} while(0);
printf("\n\nEND OF TEST\n\n");
system_done();
return 0;
}
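/*
 * Illustrative sketch (not part of the original sources): the mapping from an
 * iteration count to the printed character, which main() spells out in three
 * separate printf branches. The helper name iter_glyph is hypothetical; the
 * mapping itself mirrors the branches above.
 */

```c
/* Return the character printed for a point that escaped after k iterations. */
static char iter_glyph(unsigned k, unsigned max_iter) {
    if (k == max_iter)
        return '*';                      /* never escaped: inside the set */
    if (k < 10)
        return (char)('0' + k);          /* 0..9 */
    /* 10..35 map to 'A'..'Z'; larger counts are clamped to 'Z' */
    return (char)('A' + ((k > 35) ? 35 : k) - 10);
}
```

/*
 * Collecting the mapping in one helper would leave a single place where the
 * NOIOTIME stop/start calls wrap the output.
 */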
# Makefile for paranoia project
BIN=paranoia
# C sources
SOURCES=paranoia.c
# header and additional files
HEADERS=../common_sys_header.inc
CFLAGS=
# libraries in required order (compiled in the order)
LIBS=sf built m bcc c bcc built
# Makefile for primenums project
BIN=primenums
# C sources
SOURCES=primenums.c ise.c
HEADERS=../common_sys_header.inc
CFLAGS=-ggdb
# required libraries in correct order
LIBS=c m c bcc built
#define op_swar(A,B,RES) \
asm volatile ( \
"swar %0,%1,%2\n" \
: "=r"(RES) \
: "r"(A), "r"(B) \
)
#define op_set_ctrl(A) \
asm volatile ( \
"csrw swarctrlstat, %0\n" \
: : "r"(A) \
)
#define op_get_stat(RES) \
asm volatile ( \
"csrr %0, swarctrlstat\n" \
: "=r"(RES) \
)
#define op_get_acc(A,RES) \
asm volatile ( \
"csrw swaracc, %1\n" \
"csrr %0, swaracc\n" \
: "=r"(RES) : "r"(A) \
)
#ifdef DIRECT_ACCUM
#define op_get_acc(RES) \
asm volatile ( \
"csrr %0, swaracc\n" \
: "=r"(RES) \
)
#define op_get_acchi(RES) \
asm volatile ( \
"csrr %0, swaracchi\n" \
: "=r"(RES) \
)
#endif
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : ise.c
* Authors : Martin Danek
* Description : models of the ISE for LEON2
* Release :
* Version : 1.0
* Date : 20.04.2019
* -----------------------------------------------------------------------------
*/
#include <stdint.h>
#include <stdio.h>
#include "ise.h"
#define BALANCE
// #define DBGPRINT
REGTYPE swar_dem(REGTYPE exp, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned interleaved, unsigned cplx) {
unsigned i;
unsigned mask;
#ifndef BALANCE
unsigned ar, ai, br, bi;
REGTYPE re, im;
REGTYPE res;
#else
int ar, ai, br, bi;
SREGTYPE re, im;
SREGTYPE res;
#endif
unsigned split;
mask = (1 << bits) - 1;
split = (packingFactor / 2) * bits;
re=0;
im=0;
res=0;
#ifdef DBGPRINT
printf("exp %016lX signal %016lX\n", exp, signal);
#endif /* DBGPRINT */
for (i=0;i<packingFactor;i+=2) { // 2 values per one complex number
ai = ((exp) >> bits*i) & mask;
ar = ((exp) >> (bits*(i+1))) & mask;
bi = ((signal) >> bits*i) & mask;
br = ((signal) >> (bits*(i+1))) & mask;
#ifdef BALANCE
if (bits==1) {
ai=(ai?ai:-1);
ar=(ar?ar:-1);
bi=(bi?bi:-1);
br=(br?br:-1);
}
else {
ai = ai - (1<<(bits-1));
ar = ar - (1<<(bits-1));
bi = bi - (1<<(bits-1));
br = br - (1<<(bits-1));
}
#endif
if (cplx) {
// complex multiplication
re = mask & (((ar * br) >> bits) + ((ai * bi) >> bits)) >> 1;
#ifndef BALANCE
im = mask & ((((ar * bi) >> bits) - ((ai * br) >> bits)) >> 1) + ((1<<bits)-1);
#else
im = mask & ((((ar * bi) >> bits) - ((ai * br) >> bits)) >> 1);
#endif
}
else {
// real multiplication
re = mask & ((ar * br) >> bits);
im = mask & ((ai * bi) >> bits);
}
if (interleaved) {
// interleaved store
res |= (re << (bits*(i+1)));
res |= (im << (bits*i));
}
else {
// consecutive store
res |= (re << (split + (bits*(i>>1))));
res |= (im << (bits*(i>>1)));
}
#ifdef DBGPRINT
printf("%5d e_r %3d e_i %3d s_r %3d s_i %3d re %3ld im %3ld res %016lX\n",i, ar, ai, br, bi, re, im, res);
#endif /* DBGPRINT */
}
return res;
}
SREGTYPE swar_corr(REGTYPE code, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned sgnd, unsigned reduce, SREGTYPE *sop) {
unsigned i, mask;
int a, c;
int b;
int res;
mask = (1 << bits) - 1;
res=0;
for (i=0;i<packingFactor;i++) {
a = ((code) >> i) & 1;
if (sgnd) a = (a==0)?-1:a;
b = ((signal) >> (bits * i)) & mask;
c = a * b;
res += c;
#ifdef DBGPRINT
printf("%06d cd % 2d sg %3d c % 3d res %10d\n", i, a, b, c, res);
#endif /* DBGPRINT */
}
if (reduce)
*sop += res;
else
*sop = res;
return *sop;
}
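/*
 * Illustrative sketch (not part of the original sources): a worked instance of
 * the correlation that swar_corr() models, written as a standalone function
 * for a 32-bit word. The name corr_sketch is hypothetical. It assumes
 * bits < 32 and lanes * bits <= 32.
 */

```c
#include <stdint.h>

/* Sum of signal lanes weighted by one code bit per lane.
 * With sgnd != 0 a code bit of 0 counts as -1, otherwise as 0. */
static int corr_sketch(uint32_t code, uint32_t signal,
                       unsigned lanes, unsigned bits, int sgnd) {
    uint32_t mask = (1u << bits) - 1u;
    int sum = 0;
    for (unsigned i = 0; i < lanes; i++) {
        int a = (int)((code >> i) & 1u);
        if (sgnd && a == 0)
            a = -1;
        int b = (int)((signal >> (bits * i)) & mask);
        sum += a * b;
    }
    return sum;
}
```

/*
 * For code = 0x5 and signal = 0x4321 with four 4-bit lanes (values 1,2,3,4),
 * the unsigned result is 1+3 = 4 and the signed result is 1-2+3-4 = -2.
 */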
#ifdef RV32
#define NPI 0x80000000
#define NPI_HALF 0x40000000
#define SINE_THR_N_1 85669753 // 0.125327831168065
#define SINE_THR_N_2 172723448 // 0.252680255142079
#define SINE_THR_N_3 262760287 // 0.384396774495639
#define SINE_THR_N_4 357913941 // 0.523598775598299
#define SINE_THR_N_5 461496472 // 0.675131532937032
#define SINE_THR_N_6 579705789 // 0.848062078981481
#define SINE_THR_N_7 728294928 // 1.06543581651074
#define SINE_THR_N_8 NPI_HALF // 1.5707963267949
#define COSINE_THR_N_1 345446896 // 0.505360510284157
#define COSINE_THR_N_2 494036035 // 0.722734247813416
#define COSINE_THR_N_3 612245352 // 0.895664793857865
#define COSINE_THR_N_4 715827883 // 1.0471975511966
#define COSINE_THR_N_5 810981537 // 1.18639955229926
#define COSINE_THR_N_6 901018376 // 1.31811607165282
#define COSINE_THR_N_7 988072071 // 1.44546849562683
#define COSINE_THR_N_8 NPI_HALF // 1.5707963267949
#else
#define NPI 0x8000000000000000
#define NPI_HALF 0x4000000000000000
#define SINE_THR_N_1 0x051B37797325C5C0 // 0.125327831168065 0.079786175349536
#define SINE_THR_N_2 0x0A4B8CF83D29BB80 // 0.252680255142079 0.160861246510332
#define SINE_THR_N_3 0x0FA9675F16BBCA80 // 0.384396774495639 0.244714587078246
#define SINE_THR_N_4 0x1555555555555600 // 0.523598775598299 0.333333333333333
#define SINE_THR_N_5 0x1B81E0985CC8EB00 // 0.675131532937032 0.429802082816549
#define SINE_THR_N_6 0x228D9BBCB992E400 // 0.848062078981481 0.539893087674768
#define SINE_THR_N_7 0x2B68E60F85AC8A00 // 1.065435816510739 0.678277506979335
#define SINE_THR_N_8 NPI_HALF // 1.570796326794897 1.000000000000000
#define COSINE_THR_N_1 0x149719F07A537700 // 0.505360510284157 0.321722493020665
#define COSINE_THR_N_2 0x1D726443466D1E00 // 0.722734247813416 0.460106912325232
#define COSINE_THR_N_3 0x247E1F67A3371600 // 0.895664793857865 0.570197917183451
#define COSINE_THR_N_4 0x2AAAAAAAAAAAAC00 // 1.047197551196598 0.666666666666667
#define COSINE_THR_N_5 0x305698A0E9443800 // 1.186399552299258 0.755285412921754
#define COSINE_THR_N_6 0x35B47307C2D64600 // 1.318116071652818 0.839138753489668
#define COSINE_THR_N_7 0x3AE4C8868CDA3C00 // 1.445468495626831 0.920213824650464
#define COSINE_THR_N_8 NPI_HALF // 1.570796326794897 1.000000000000000
#endif
// Compute sine(arg) for <0;2*pi> mapped to <0;0xffffffff>
REGTYPE sineQuantNorm(unsigned bits, REGTYPE arg) {
unsigned pos, val=0; // val stays 0 when bits is outside 1..4
pos=1;
// printf("arg1 %f 2*PI %f\n",arg, 2*PI);
// printf("arg2 %f\n",arg);
if (arg>NPI) {
// arg=arg-NPI;
arg&=(~NPI);
pos=0;
}
if (arg>(NPI_HALF)) {
// arg=(NPI)-arg;
arg=((~arg)+1)&(~NPI);
}
// printf("arg3 %f\n",arg);
if (bits==1) {
val=pos;
}
if (bits==2) {
if (arg<SINE_THR_N_4) val=2;
else val=3;
if (pos==0) val=3-val;
}
if (bits==3) {
if (arg<SINE_THR_N_2) val=4;
else if (arg<SINE_THR_N_4) val=5;
else if (arg<SINE_THR_N_6) val=6;
else val=7;
if (pos==0) val=7-val;
}
if (bits==4) {
if (arg<SINE_THR_N_1) val=8;
else if (arg<SINE_THR_N_2) val=9;
else if (arg<SINE_THR_N_3) val=10;
else if (arg<SINE_THR_N_4) val=11;
else if (arg<SINE_THR_N_5) val=12;
else if (arg<SINE_THR_N_6) val=13;
else if (arg<SINE_THR_N_7) val=14;
else val=15;
if (pos==0) val=15-val;
}
return val;
}
// Compute cosine(arg) for <0;2*pi> mapped to <0;0xffffffff>
REGTYPE cosineQuantNorm(unsigned bits, REGTYPE arg) {
unsigned pos, val=0; // val stays 0 when bits is outside 1..4
pos=1;
// printf("arg1 %f 2*PI %f\n",arg, 2*PI);
// printf("arg2 %f\n",arg);
if (arg>NPI) {
// arg=2*NPI-arg;
arg=(~arg)+1;
}
if (arg>(NPI_HALF)) {
// arg=(NPI)-arg;
arg=((~arg)+1)&(~NPI);
pos=0;
}
// printf("arg3 %f\n",arg);
if (bits==1) {
val=pos;
}
if (bits==2) {
if (arg<COSINE_THR_N_4) val=3;
else val=2;
if (pos==0) val=3-val;
}
if (bits==3) {
if (arg<COSINE_THR_N_2) val=7;
else if (arg<COSINE_THR_N_4) val=6;
else if (arg<COSINE_THR_N_6) val=5;
else val=4;
if (pos==0) val=7-val;
}
if (bits==4) {
if (arg<COSINE_THR_N_1) val=15;
else if (arg<COSINE_THR_N_2) val=14;
else if (arg<COSINE_THR_N_3) val=13;
else if (arg<COSINE_THR_N_4) val=12;
else if (arg<COSINE_THR_N_5) val=11;
else if (arg<COSINE_THR_N_6) val=10;
else if (arg<COSINE_THR_N_7) val=9;
else val=8;
if (pos==0) val=15-val;
}
return val;
}
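/*
 * Illustrative sketch (not part of the original sources): how the radian
 * values in the threshold comments relate to the RV32 integer constants.
 * The listed radian values agree numerically with asin(k/8) for the sine table
 * and acos(1 - k/8) for the cosine table; the scale maps pi to 0x80000000
 * (NPI). The helper name rad_to_phase32 is hypothetical, and rounding to
 * nearest is an assumption.
 */

```c
#include <stdint.h>

/* Convert radians in [0, pi) to the 32-bit phase scale where pi == 0x80000000. */
static uint32_t rad_to_phase32(double rad) {
    const double pi = 3.14159265358979323846;
    return (uint32_t)(rad / pi * 2147483648.0 + 0.5);
}
```

/*
 * rad_to_phase32(0.523598775598299) gives 357913941 (SINE_THR_N_4) and
 * rad_to_phase32(1.0471975511966) gives 715827883 (COSINE_THR_N_4).
 */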
REGTYPE swar_sincos(REGTYPE coef, unsigned bits) {
REGTYPE sin1, cos1;
REGTYPE res=0; // defined return value for an invalid bit count
cos1 = (cosineQuantNorm(bits, coef));
sin1 = (sineQuantNorm(bits, coef));
#ifdef DBGPRINT
printf("%08lX c1 %1lX s1 %1lX\n", coef, cos1, sin1);
#endif /* DBGPRINT */
switch (bits) {
case 1:
res = (cos1 << 1) | sin1;
break;
case 2:
res = (cos1 << 2) | sin1;
break;
case 3:
res = (cos1 << 3) | sin1;
break;
case 4:
res = (cos1 << 4) | sin1;
break;
default:
printf("SWAR_SINCOS: INVALID BIT COUNT\n");
break;
}
#ifdef DBGPRINT
printf("sincos: %016lX\n", res);
#endif /* DBGPRINT */
return res;
}
REGTYPE swar_alu(REGTYPE veca, REGTYPE vecb, unsigned packingFactor, unsigned bits, unsigned oper, unsigned sgnd, unsigned sat, unsigned reduce, SREGTYPE *acc) {
unsigned i;
REGTYPE mask;
REGTYPE a, b;
REGTYPE c, res;
unsigned neg;
mask = (1 << bits) - 1;
c=0;
res=0;
neg=0;
#ifdef DBGPRINT
printf("veca %016lX vecb %016lX packing %d bits %d oper %d sgnd %d sat %d red %d\n", veca, vecb, packingFactor, bits, oper, sgnd, sat, reduce);
#endif /* DBGPRINT */
for (i=0;i<packingFactor;i++) {
a = ((veca) >> bits*i) & mask;
b = ((vecb) >> bits*i) & mask;
if (sgnd) {
if (a & (1<<(bits-1))) a |= ~(mask);
if (b & (1<<(bits-1))) b |= ~(mask);
}
switch (oper) {
case ADD: c = a + b; break;
case SUB: c = a - b; neg=(a<b)?1:0; break;
case MUL: c = a * b; break;
default: printf("SWAR_ALU: INVALID OPERATION\n"); break;
}
if (sat==SATUR) {
if (!sgnd) {
if (neg) c=0;
else {
if (c>((1<<bits)-1)) c = (1<<bits)-1;
}
}
else {
if ((SREGTYPE)c>(SREGTYPE)((1<<(bits-1))-1)) c = (1<<(bits-1))-1;
else if ((SREGTYPE)c<(SREGTYPE)(-(1<<(bits-1)))) c = (-(1<<(bits-1)));
}
}
// else {
// if ((!sgnd) && (neg)) c = c | (~mask);
// }
#ifdef DBGPRINT
printf("i %d sgnd %d sat %d a %016lX b %016lX c %016lX neg %d\n", i, sgnd, sat, a, b, c, neg);
#endif
if (reduce) res += c;
else res |= (c & mask) << (bits*i);
// if ((!sgnd) && (bits<16)) c &= (1 << (bits<<1)) - 1;
*(acc+i) += c;
#ifdef DBGPRINT
printf("c %016lX acc[%d] = %016lX\n", c, i, *(acc+i));
#endif
#ifdef DBGPRINT
printf("%5d a %3ld b %3ld c %3ld res %016lX\n",i, a, b, c, res); // acc(%08X) , acc[i]
#endif /* DBGPRINT */
}
return res;
}
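/*
 * Illustrative sketch (not part of the original sources): the unsigned
 * saturating ADD path of swar_alu() reduced to a standalone 32-bit helper,
 * without the reduce and accumulator handling. The name sat_add_lanes is
 * hypothetical. It assumes bits < 32 and lanes * bits <= 32.
 */

```c
#include <stdint.h>

/* Add two packed vectors lane by lane, clamping each lane at its maximum. */
static uint32_t sat_add_lanes(uint32_t va, uint32_t vb,
                              unsigned lanes, unsigned bits) {
    uint32_t mask = (1u << bits) - 1u;
    uint32_t res = 0;
    for (unsigned i = 0; i < lanes; i++) {
        uint32_t a = (va >> (bits * i)) & mask;
        uint32_t b = (vb >> (bits * i)) & mask;
        uint32_t c = a + b;
        if (c > mask)
            c = mask;                    /* saturate instead of wrapping */
        res |= c << (bits * i);
    }
    return res;
}
```

/*
 * With four 4-bit lanes, 0x0F07 + 0x0105 gives 0x0F0C: lane 0 is 7+5 = 0xC and
 * lane 2 clamps 0xF+1 to 0xF instead of carrying into lane 3.
 */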
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : ise.h
* Authors : Martin Danek
* Description : models of the ISE for LEON2
* Release :
* Version : 1.0
* Date : 20.04.2019
* -----------------------------------------------------------------------------
*/
#ifndef ISE_H
#define ISE_H
/* default architecture is 64bit */
#ifdef RV32
typedef uint32_t REGTYPE;
typedef int32_t SREGTYPE;
#define REGBITS 32
#define REGFMT "0x%08X"
#else
typedef uint64_t REGTYPE;
typedef int64_t SREGTYPE;
#define REGBITS 64
#define REGFMT "0x%016lX"
#endif
#define ADD 0x0
#define SUB 0x8
#define MUL 0xC
#define NOSAT 0
#define SATUR 1
REGTYPE swar_dem(REGTYPE exp, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned interleaved, unsigned cplx);
SREGTYPE swar_corr(REGTYPE code, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned sgnd, unsigned reduce, SREGTYPE *sop);
REGTYPE swar_sincos(REGTYPE coef, unsigned bits);
REGTYPE swar_alu(REGTYPE veca, REGTYPE vecb, unsigned packingFactor, unsigned bits, unsigned oper, unsigned sgnd, unsigned sat, unsigned reduce, SREGTYPE *acc);
#endif /* ISE_H */
#ifndef SWAR_HEADER_FILE
#define SWAR_HEADER_FILE
/* SWAR configuration register */
#define SWAR_CONF_ACCSIZE_MASK 0x3F
#define SWAR_CONF_ACCSIZE_SHIFT 26
#define SWAR_CONF_SWIDTH_MASK 0x1F
#define SWAR_CONF_SWIDTH_SHIFT 21
#define SWAR_CONF_LANES_MASK 0x1F
#define SWAR_CONF_LANES_SHIFT 16
#define SWAR_CONF_FCN_CORREL 0
#define SWAR_CONF_FCN_DEMOD 1
#define SWAR_CONF_FCN_SINCOS 2
#define SWAR_CONF_FCN_AUDIO 3
#define SWAR_CONF_FCN_VIDEO 4
#define SWAR_CONF_FCN_ALU 5
#define SWAR_CONF_FCN_LACCUMS 6
/* SWAR control register */
/* audio/video/alu operations */
#define SW_OP_ADD 0x00
#define SW_OP_SUB 0x08
#define SW_OP_MUL 0x0C
/* correlation */
#define SW_OP_COR1b 0x04
#define SW_OP_COR2b 0x05
#define SW_OP_COR3b 0x06
#define SW_OP_COR4b 0x07
/* demodulation */
#define SW_OP_DEMR2b 0x09
#define SW_OP_DEMR3b 0x0A
#define SW_OP_DEMR4b 0x0B
#define SW_OP_DEMC2b 0x0D
#define SW_OP_DEMC3b 0x0E
#define SW_OP_DEMC4b 0x0F
#define SW_OP_DEMC2bG 0x01
#define SW_OP_DEMC3bG 0x02
#define SW_OP_DEMC4bG 0x03
/* sincos LUT */
#define SW_OP_SC1b 0x10
#define SW_OP_SC2b 0x20
#define SW_OP_SC3b 0x30
#define SW_OP_SC4b 0x40
#define SW_CTRL_OPMASK 0xFF
#define SW_CTRL_SIGNED (1<<8)
#define SW_CTRL_REDUCE (1<<9)
#define SW_CTRL_SATURATE (1<<10)
#define SW_CTRL_NORMALIZE (1<<11)
#define SW_CTRL_AUDIO (1<<12)
#define SW_CTRL_VIDEO (1<<13)
#define SW_CTRL_ALU (1<<14)
#endif /* SWAR_HEADER_FILE */
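/*
 * Illustrative sketch (not part of the original sources): composing and
 * decoding a control word from the opcode and flag definitions above. The
 * EX_* names and the helper functions are hypothetical; the values copy
 * SW_OP_MUL, SW_CTRL_OPMASK, SW_CTRL_SIGNED, SW_CTRL_SATURATE and
 * SW_CTRL_AUDIO. That the opcode and the flags are OR-ed into a single word
 * is an assumption based on the masks.
 */

```c
#include <stdint.h>

enum {
    EX_OP_MUL   = 0x0C,
    EX_OPMASK   = 0xFF,
    EX_SIGNED   = 1 << 8,
    EX_SATURATE = 1 << 10,
    EX_AUDIO    = 1 << 12
};

/* Combine an opcode with mode flags into one control word. */
static uint32_t make_ctrl(uint32_t op, uint32_t flags) {
    return (op & EX_OPMASK) | flags;
}

/* Extract the opcode field from a control word. */
static uint32_t ctrl_op(uint32_t ctrl) {
    return ctrl & EX_OPMASK;
}

/* Test whether a mode flag is set in a control word. */
static int ctrl_has(uint32_t ctrl, uint32_t flag) {
    return (ctrl & flag) != 0;
}
```

/*
 * make_ctrl(EX_OP_MUL, EX_SIGNED | EX_AUDIO) gives 0x110C, a signed multiply
 * in audio mode without saturation.
 */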
../scripts
# Makefile for Stanford test
BIN=sford
# C sources
SOURCES=stanford.c
HEADERS=../common_sys_header.inc
CFLAGS=-ggdb
# libraries in required order (compiled in the order)
LIBS=c m c bcc built
# Makefile for swarfir project
BIN=swarfir
# C sources
SOURCES=swarfir.c
HEADERS=../common_sys_header.inc
# required libraries in correct order
LIBS=c m c bcc built
CFLAGS=
% Signal generation
sample_rate = 48000;
nsamples = 256;
F = [1 15] * 1000;
A = [1 0.5];
% Time vector - use colon operator to generate integer vector of sample numbers
t = (0:nsamples-1) / sample_rate;
% Test signal - use matrix notation to compose it with single expression
signal = A * sin(2*pi*F'*t);
% FIR coefficient generation
% Choose filter cutoff frequency (6 kHz)
cutoff_hz = 6000;
% Normalize cutoff frequency (wrt Nyquist frequency)
nyq_freq = sample_rate / 2;
cutoff_norm = cutoff_hz / nyq_freq;
% FIR filter order (i.e. number of coefficients - 1)
order = 24;
% Create lowpass FIR filter through a direct approach
% NOTE: fir1, firpmord and firpm all require Signal Processing Toolbox
fir_coeff = fir1(order, cutoff_norm);
% Analyse the filter using the Filter Visualization Tool
%fvtool(fir_coeff, 'Fs', sample_rate)
% Filter the signal with the FIR filter
filtered_signal = filter(fir_coeff, 1, signal);
% Convert to 8-bit integer version
conv_scale = 92
sig8b = int8(signal*conv_scale);
printf("signal = [");
printf("%d,",sig8b);
printf("];\n");
fir8b = int8(fir_coeff*conv_scale);
printf("fir_coeff = [");
printf("%d,",fir8b);
printf("];\n");
filtsig8b = filter(fir8b, 1, sig8b);
printf("filt_signal = [");
printf("%d,",filtsig8b);
printf("];\n");
% filtsig8b is multiplied by conv_scale^2
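/*
 * Illustrative sketch (not part of the original sources): a direct-form FIR
 * filter over 8-bit samples and coefficients, the C counterpart of the
 * filter(fir8b, 1, sig8b) call in the script above. The name fir_int8 is
 * hypothetical. Using 32-bit accumulators and treating samples before x[0]
 * as zero are assumptions of this sketch.
 */

```c
#include <stddef.h>
#include <stdint.h>

/* y[i] = sum over k of h[k] * x[i-k], for k = 0 .. ntaps-1 and k <= i. */
static void fir_int8(const int8_t *x, size_t n,
                     const int8_t *h, size_t ntaps, int32_t *y) {
    for (size_t i = 0; i < n; i++) {
        int32_t acc = 0;
        for (size_t k = 0; k < ntaps && k <= i; k++)
            acc += (int32_t)h[k] * (int32_t)x[i - k];
        y[i] = acc;
    }
}
```

/*
 * Because both the signal and the coefficients are scaled by conv_scale, the
 * outputs carry a factor of conv_scale^2, as the script's last comment notes.
 */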
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : swarconf.h
* Authors : Martin Danek
* Description : configuration of SWAR instructions
* Release :
* Version : 1.0
* Date : 9.4.2019
* -----------------------------------------------------------------------------
*/
#ifndef SWAR_CONFIGURATION_WORD_H
#define SWAR_CONFIGURATION_WORD_H
// SWAR configuration register definitions
#define SW_OP_ADD 0x0
#define SW_OP_SUB 0x8
#define SW_OP_MUL 0xc
//
#define SW_OP_COR1b 0x4
#define SW_OP_COR2b 0x5
#define SW_OP_COR3b 0x6
#define SW_OP_COR4b 0x7
//
#define SW_OP_DEMR2b 0x9
#define SW_OP_DEMR3b 0xa
#define SW_OP_DEMR4b 0xb
//
#define SW_OP_DEMC2b 0xd
#define SW_OP_DEMC3b 0xe
#define SW_OP_DEMC4b 0xf
//
#define SW_OP_DEMC2bG 0x1
#define SW_OP_DEMC3bG 0x2
#define SW_OP_DEMC4bG 0x3
//
#define SW_OP_SC1b 0x10
#define SW_OP_SC2b 0x20
#define SW_OP_SC3b 0x30
#define SW_OP_SC4b 0x40
//
#define SW_SGND 0x100
#define SW_RED 0x200
#define SW_SAT 0x400
#define SW_NORM 0x800
//
#define SW_AUDIO 0x1000
#define SW_VIDEO 0x2000
#define SW_ALU 0x4000
#endif /* SWAR_CONFIGURATION_WORD_H */
# Description of format is in the script 'test.sh'
# In short, each processed line contains at least three parameters:
# <TESTNAME> <ID> <CONFIG> <OPTIONAL_CFLAGS>
# use LLVM for compilation
! llvm
# use default linker script with 1MB RAM
# use the following options 'march' and 'mabi' for compilation
> rv64g lp64d
^ -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000
### Hello - code and data in 64kB RAM
#hello 1 none
hello 1 daifpu_dual_dpsp_divsqrt
> rv64ima_zfh lp64
### Test half - code and data in 64kB RAM
testhalf 2 daifpu_hp_divsqrt
### Linpack
^ -O2 -DOPT=2 -DROLL -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000
@ O2
> rv64imafd lp64d
linpack 10 daifpu_dual_dpsp_divsqrt -DDP
linpack 11 daifpu_dual_dpsp_divsqrt -DSP
linpack 12 daifpu_dual_dpsp_divsqrt -DHP
> rv64imaf_zfh lp64f
linpack 13 daifpu_dual_sphp_divonly -DDP
linpack 14 daifpu_dual_sphp_divonly -DSP
linpack 15 daifpu_dual_sphp_divonly -DHP
^ -O2 -DOPT=2 -DUNROLL -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000
@ O2
> rv64imafd lp64d
linpack 16 daifpu_dual_dpsp_divsqrt -DDP
linpack 17 daifpu_dual_dpsp_divsqrt -DSP
linpack 18 daifpu_dual_dpsp_divsqrt -DHP
> rv64ima lp64
linpack 19 none -DDP
linpack 20 none -DSP
linpack 21 none -DHP
> rv64imafd lp64d
### Paranoia
^ -O1 -DCRAM=0x00000000 -DRAMSIZE=0x100000
@ O1
paranoia 30 daifpu_dual_dpsp_divsqrt -DDOUBLE_PRECISION
paranoia 31 daifpu_dual_dpsp_divsqrt -DSINGLE_PRECISION
paranoia 32 daifpu_dual_dpsp_divsqrt -DHALF_PRECISION
### Stanford
^ -O2 -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000
@ O2
sford 40 daifpu_dp_divsqrt -DDP
sford 41 daifpu_dp_divsqrt -DSP
sford 42 daifpu_dp_divsqrt -DHP
> rv64imaf_zfh lp64f
sford 40 daifpu_dual_sphp_none -DDP
sford 41 daifpu_dual_sphp_none -DSP
sford 42 daifpu_dual_sphp_none -DHP
> rv64imafd lp64d
### Whetstone
^ -O2 -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000
@ O2
wstone 50 daifpu_dual_dpsp_divsqrt -DDP
wstone 51 daifpu_dual_dpsp_divsqrt -DSP
wstone 52 daifpu_dual_dpsp_divsqrt -DHP
wstone 53 daifpu_dp_divonly -DDP
wstone 54 daifpu_dp_divonly -DSP
wstone 55 daifpu_dp_divonly -DHP
> rv64imafd_zfh_x-fph_x-fps lp64
### Mandel
^ -O0 -DRAMSIZE=262144 -DCRAM=0x40000000 -DSYSCLK_PERIOD=100000000
@ O2
mandelph 60 daifpu_hp_divsqrt
mandelph 61 daifpu_php_divsqrt
mandelph 62 daifpu_php_divsqrt -DPACKED
mandelph 63 daifpu_dual_dpsp_divsqrt
mandelph 64 none
mandelps 65 daifpu_dual_dpsp_divsqrt
mandelps 66 daifpu_sp_divsqrt
mandelps 67 daifpu_psp_divsqrt
mandelps 68 daifpu_psp_divsqrt -DPACKED
mandelps 69 daifpu_php_divsqrt
mandelps 70 none
### FFT
#$ link256kB
^ -O3 -DRAMSIZE=262144 -DCRAM=0x40000000 -DSYSCLK_PERIOD=100000000
@ O3
~ TABLE_SIZE=10
fft 80 daifpu_hp_divsqrt
fft 81 daifpu_php_divsqrt
fft 82 daifpu_psp_divsqrt -DPACKED
fft 83 daifpu_dual_dpsp_divsqrt
fft 84 none
fft 85 daifpu_psp_divsqrt
~ TABLE_SIZE=12
fft 87 daifpu_psp_divsqrt -DPACKED -DTABLE_SIZE=12
fft 88 daifpu_dual_dpsp_divsqrt -DTABLE_SIZE=12
fft 89 daifpu_psp_divsqrt -DTABLE_SIZE=12
#~ TABLE_SIZE=14
#fft 90 daifpu_psp_divsqrt -DPACKED -DTABLE_SIZE=14
#fft 91 daifpu_dual_dpsp_divsqrt -DTABLE_SIZE=14
#fft 92 daifpu_psp_divsqrt -DTABLE_SIZE=14
# Description of format is in the script 'test.sh'
# In short, each processed line contains at least three parameters:
# <TESTNAME> <ID> <CONFIG> <OPTIONAL_CFLAGS>
# use makefile for LLVM toolchain
! llvm
# use default linker script with 1MB RAM
# use the following options 'march' and 'mabi' for compilation
> rv64g_x-swar lp64
### test SWAR
@ O2
^ -O2 -DOPT=2 -DROLL -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000 -daiteq-swar-enable -DSWAR_ALU -DSWAR_UNIT_TYPE=1 -DMAX_ACC=4
testswar 100 swar_sincos1234_audio16_video4_acc4x22
^ -O2 -DOPT=2 -DROLL -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000 -daiteq-swar-enable -DSWAR_ALU -DSWAR_UNIT_TYPE=2 -DMAX_ACC=4
testswar 101 swar_sincos1234_audio16_video4_acc4x22
# Tests operations on SWAR partitionings 32x1b, 16x2b, 10x3b and 8x4b. For the LEON2FT SWAR configuration 16x2b, only the operations for the
# 16x2b partitioning complete without errors, while 32x1b, 10x3b and 8x4b complete with errors. This is expected.
# When the LEON2FT SWAR is configured for another partitioning, e.g. 10x3b, the respective operations complete without errors and the
# others complete with errors.
^ -O2 -DOPT=2 -DROLL -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000 -daiteq-swar-enable -DSWAR_ALU -DSWAR_UNIT_TYPE=4 -DMAX_ACC=4
testswar 102 swar_sincos2_alu16x2_acc16x22
### primenums
@ O0
^ -O0 -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000 -daiteq-swar-enable -DSLICESZ=8
primenums 110 none
### FIR
@ O0
^ -O0 -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000 -daiteq-swar-enable
swarfir 120 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=10 -DCONFIG_FIR_LENGTH=5
swarfir 121 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=50 -DCONFIG_FIR_LENGTH=5
swarfir 122 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=100 -DCONFIG_FIR_LENGTH=5
swarfir 123 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=500 -DCONFIG_FIR_LENGTH=5
swarfir 124 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=1000 -DCONFIG_FIR_LENGTH=5
swarfir 125 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=5000 -DCONFIG_FIR_LENGTH=5
swarfir 130 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=10 -DCONFIG_FIR_LENGTH=10
swarfir 131 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=50 -DCONFIG_FIR_LENGTH=10
swarfir 132 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=100 -DCONFIG_FIR_LENGTH=10
swarfir 133 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=500 -DCONFIG_FIR_LENGTH=10
swarfir 134 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=1000 -DCONFIG_FIR_LENGTH=10
swarfir 135 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=5000 -DCONFIG_FIR_LENGTH=10
swarfir 140 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=4 -DCONFIG_DATA_LENGTH=500
swarfir 141 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=12 -DCONFIG_DATA_LENGTH=500
swarfir 142 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=20 -DCONFIG_DATA_LENGTH=500
swarfir 143 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=25 -DCONFIG_DATA_LENGTH=500
swarfir 144 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=30 -DCONFIG_DATA_LENGTH=500
swarfir 145 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=40 -DCONFIG_DATA_LENGTH=500
swarfir 146 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=50 -DCONFIG_DATA_LENGTH=500
@ O2
^ -O2 -DRAMSIZE=0x100000 -DCRAM=0x00000000 -DSYSCLK_PERIOD=100000000 -daiteq-swar-enable
swarfir 150 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=10 -DCONFIG_FIR_LENGTH=5
swarfir 151 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=50 -DCONFIG_FIR_LENGTH=5
swarfir 152 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=100 -DCONFIG_FIR_LENGTH=5
swarfir 153 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=500 -DCONFIG_FIR_LENGTH=5
swarfir 154 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=1000 -DCONFIG_FIR_LENGTH=5
swarfir 155 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=5000 -DCONFIG_FIR_LENGTH=5
swarfir 160 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=10 -DCONFIG_FIR_LENGTH=10
swarfir 161 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=50 -DCONFIG_FIR_LENGTH=10
swarfir 162 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=100 -DCONFIG_FIR_LENGTH=10
swarfir 163 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=500 -DCONFIG_FIR_LENGTH=10
swarfir 164 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=1000 -DCONFIG_FIR_LENGTH=10
swarfir 165 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_DATA_LENGTH=5000 -DCONFIG_FIR_LENGTH=10
swarfir 170 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=4 -DCONFIG_DATA_LENGTH=500
swarfir 171 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=12 -DCONFIG_DATA_LENGTH=500
swarfir 172 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=20 -DCONFIG_DATA_LENGTH=500
swarfir 173 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=25 -DCONFIG_DATA_LENGTH=500
swarfir 174 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=30 -DCONFIG_DATA_LENGTH=500
swarfir 175 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=40 -DCONFIG_DATA_LENGTH=500
swarfir 176 swar_sincos1234_audio16_video4_acc4x22 -DCONFIG_ONLINE_FIRLEN=50 -DCONFIG_DATA_LENGTH=500
# Makefile for testhalf project
BIN=testhalf
SOURCES=test.c
LIBS=c m c bcc built
/* -----------------------------------------------------------------------------
* Copyright (C) 2019-2021 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : testhalf.c
* Authors : Roman Bartosinski
* Description : Simple test with packed half-precision ops
* Release :
* Version : 1.0
* Date : 27.4.2021
* -----------------------------------------------------------------------------
*/
#include <stdint.h>
#include <stdio.h>
half add(half a, half b)
{
return a+b;
}
/* NOTE: despite its name, this helper returns the square a*a, not a square root. */
half hsqrt(half a)
{
return a*a;
}
int main(void)
{
half a = 3.1415H;
half b = 2.71828H;
volatile half c;
printf("A=%f (x%04X)\n", a, *((uint16_t *)&a));
printf("B=%f (x%04X)\n", b, *((uint16_t *)&b));
c = add(a,b);
printf("a+b=%f (x%04X)\n", c, *((uint16_t *)&c));
printf("a-b=%f\n", a-b);
printf("b-a=%f\n", b-a);
c = hsqrt(a);
printf("A*A=%f (x%04X)\n", c, *((uint16_t *)&c));
return 0;
}
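/*
 * Illustrative sketch (not part of the original sources): reading the raw bit
 * pattern of a floating-point value with memcpy instead of the pointer casts
 * used in the printf calls above, which sidesteps strict-aliasing concerns.
 * The sketch uses float and uint32_t because the half type depends on the
 * daiteq toolchain; the name float_bits is hypothetical, and an IEEE 754
 * binary32 float is assumed.
 */

```c
#include <stdint.h>
#include <string.h>

/* Return the 32-bit pattern that stores f. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

/*
 * The same pattern with a uint16_t destination would apply to a 16-bit half
 * value on a toolchain that provides the type.
 */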
BIN=testswar
SOURCES=testswar.c ise.c
HEADERS=../common_sys_header.inc
LIBS=c m c bcc built
CFLAGS=-ggdb
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : asm_swar.inc
* Authors : Martin Danek
* Description : inline assembly routines
* Release :
* Version : 1.0
* Date : 9.4.2019
* -----------------------------------------------------------------------------
*/
#define op_swar(A,B,RES) \
asm volatile ( \
"swar %0,%1,%2\n" \
: "=r"(RES) \
: "r"(A), "r"(B) \
)
#define op_set_ctrl(A) \
asm volatile ( \
"csrw swarctrlstat, %0\n" \
: : "r"(A) \
)
#define op_get_stat(RES) \
asm volatile ( \
"csrr %0, swarctrlstat\n" \
: "=r"(RES) \
)
#define op_get_acc(A,RES) \
asm volatile ( \
"csrw swaracc, %1\n" \
"csrr %0, swaracc\n" \
: "=r"(RES) : "r"(A) \
)
#ifdef DIRECT_ACCUM
#define op_get_acc(RES) \
asm volatile ( \
"csrr %0, swaracc\n" \
: "=r"(RES) \
)
#define op_get_acchi(RES) \
asm volatile ( \
"csrr %0, swaracchi\n" \
: "=r"(RES) \
)
#endif
#!/bin/bash
echo; echo "*** testswar"
mkdir -p batch
ACC_NUM=4 make
mv build/testswar.elf batch/testswar.elf
mv build/testswar.srec batch/testswar.srec
make clean
echo; echo "*** testswar-acc-8"
ACC_NUM=8 make
mv build/testswar.elf batch/testswar-acc-8.elf
mv build/testswar.srec batch/testswar-acc-8.srec
make clean
echo; echo "*** testswar-acc-16"
ACC_NUM=16 make
mv build/testswar.elf batch/testswar-acc-16.elf
mv build/testswar.srec batch/testswar-acc-16.srec
make clean
echo; echo "*** testswar-dummy"
CFLAGS=-DSWAR_DUMMY make
mv build/testswar.elf batch/testswar-dummy.elf
mv build/testswar.srec batch/testswar-dummy.srec
make clean
/* SWAR accumulator */
union swaraccum {
struct {
uint32_t lo;
uint32_t hi;
} u32;
uint64_t u64;
} swar_accum[4];
static unsigned swar_ctrl = 0;
#define op_swar(A,B,RES) \
do { \
unsigned b; \
switch(swar_ctrl & 0xFF) { \
case SW_OP_ADD: case SW_OP_SUB: case SW_OP_MUL: \
if (swar_ctrl & SW_CTRL_AUDIO) b = 16; \
else if (swar_ctrl & SW_CTRL_VIDEO) b = 8; \
else b = 4; \
RES = swar_alu(A, B, 32/b, b, swar_ctrl & SW_CTRL_OPMASK, swar_ctrl & SW_CTRL_SIGNED, \
(swar_ctrl & SW_CTRL_SATURATE) ? SATUR : NOSAT, \
swar_ctrl & SW_CTRL_REDUCE, (uint32_t *)&swar_accum); \
break; \
case SW_OP_COR1b: case SW_OP_COR2b: case SW_OP_COR3b: case SW_OP_COR4b: \
b = (swar_ctrl & 0xFF)-SW_OP_COR1b+1; \
RES = swar_corr(A, B, 32/b, b, swar_ctrl & SW_CTRL_SIGNED, swar_ctrl & SW_CTRL_REDUCE, (uint32_t *) &swar_accum); \
break; \
} \
} while(0)
/*
#define op_swarcc(A,B,RES) \
asm volatile ( \
"swarcc %0,%1,%2\n" \
: "=r"(RES) \
: "r"(A), "r"(B) \
)
*/
#define op_set_ctrl(A) \
do { \
swar_ctrl = A; \
swar_accum[0].u64 = 0; \
swar_accum[1].u64 = 0; \
swar_accum[2].u64 = 0; \
swar_accum[3].u64 = 0; \
} while(0)
#define op_get_stat(RES) \
RES = swar_ctrl
#define op_get_acc0(RES) \
RES = swar_accum[0].u32.lo
#define op_get_acc0hi(RES) \
RES = swar_accum[0].u32.hi
#define op_get_acc1(RES) \
RES = swar_accum[1].u32.lo
#define op_get_acc1hi(RES) \
RES = swar_accum[1].u32.hi
#define op_get_acc2(RES) \
RES = swar_accum[2].u32.lo
#define op_get_acc2hi(RES) \
RES = swar_accum[2].u32.hi
#define op_get_acc3(RES) \
RES = swar_accum[3].u32.lo
#define op_get_acc3hi(RES) \
RES = swar_accum[3].u32.hi
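The accessor macros above read the `lo` and `hi` halves of each 64-bit accumulator through a union. The following is a minimal sketch of that aliasing, assuming a little-endian target (as RISC-V normally is); the type name `swaraccum_sketch` and the helpers `accum_lo` and `accum_hi` are hypothetical and are not part of the project sources.

```c
#include <stdint.h>

/* Same member layout as union swaraccum in the listing above. */
union swaraccum_sketch {
    struct {
        uint32_t lo;
        uint32_t hi;
    } u32;
    uint64_t u64;
};

/* Hypothetical helper: low 32 bits of a 64-bit accumulator value.
 * Relies on little-endian byte order, where `lo` overlays the low half. */
uint32_t accum_lo(uint64_t value) {
    union swaraccum_sketch a;
    a.u64 = value;
    return a.u32.lo;
}

/* Hypothetical helper: high 32 bits of a 64-bit accumulator value. */
uint32_t accum_hi(uint64_t value) {
    union swaraccum_sketch a;
    a.u64 = value;
    return a.u32.hi;
}
```

On a big-endian target the two members would swap roles, so a port would have to reorder `lo` and `hi` or use shifts instead of the union.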
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : ise.c
* Authors : Martin Danek
* Description : models of the ISE for LEON2
* Release :
* Version : 1.0
* Date : 20.04.2019
* -----------------------------------------------------------------------------
*/
#include <stdint.h>
#include <stdio.h>
#include "ise.h"
#define BALANCE
// #define DBGPRINT
REGTYPE swar_dem(REGTYPE exp, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned interleaved, unsigned cplx) {
unsigned i;
unsigned mask;
#ifndef BALANCE
unsigned ar, ai, br, bi;
REGTYPE re, im;
REGTYPE res;
#else
int ar, ai, br, bi;
SREGTYPE re, im;
SREGTYPE res;
#endif
unsigned split;
mask = (1 << bits) - 1;
split = (packingFactor / 2) * bits;
re=0;
im=0;
res=0;
#ifdef DBGPRINT
printf("exp %016lX signal %016lX\n", exp, signal);
#endif /* DBGPRINT */
for (i=0;i<packingFactor;i+=2) { // 2 values per one complex number
ai = ((exp) >> bits*i) & mask;
ar = ((exp) >> (bits*(i+1))) & mask;
bi = ((signal) >> bits*i) & mask;
br = ((signal) >> (bits*(i+1))) & mask;
#ifdef BALANCE
if (bits==1) {
ai=(ai?ai:-1);
ar=(ar?ar:-1);
bi=(bi?bi:-1);
br=(br?br:-1);
}
else {
ai = ai - (1<<(bits-1));
ar = ar - (1<<(bits-1));
bi = bi - (1<<(bits-1));
br = br - (1<<(bits-1));
}
#endif
if (cplx) {
// complex multiplication
re = mask & (((ar * br) >> bits) + ((ai * bi) >> bits)) >> 1;
#ifndef BALANCE
im = mask & ((((ar * bi) >> bits) - ((ai * br) >> bits)) >> 1) + ((1<<bits)-1);
#else
im = mask & ((((ar * bi) >> bits) - ((ai * br) >> bits)) >> 1);
#endif
}
else {
// real multiplication
re = mask & ((ar * br) >> bits);
im = mask & ((ai * bi) >> bits);
}
if (interleaved) {
// interleaved store
res |= (re << (bits*(i+1)));
res |= (im << (bits*i));
}
else {
// consecutive store
res |= (re << (split + (bits*(i>>1))));
res |= (im << (bits*(i>>1)));
}
#ifdef DBGPRINT
printf("%5d e_r %3d e_i %3d s_r %3d s_i %3d re %3ld im %3ld res %016lX\n",i, ar, ai, br, bi, re, im, res);
#endif /* DBGPRINT */
}
return res;
}
SREGTYPE swar_corr(REGTYPE code, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned sgnd, unsigned reduce, SREGTYPE *sop) {
unsigned i, mask;
int a, c;
int b;
int res;
mask = (1 << bits) - 1;
res=0;
for (i=0;i<packingFactor;i++) {
a = ((code) >> i) & 1;
if (sgnd) a = (a==0)?-1:a;
b = ((signal) >> (bits * i)) & mask;
c = a * b;
res += c;
#ifdef DBGPRINT
printf("%06d cd % 2d sg %3d c % 3d res %10d\n", i, a, b, c, res);
#endif /* DBGPRINT */
}
if (reduce)
*sop += res;
else
*sop = res;
return *sop;
}
#ifdef RV32
#define NPI 0x80000000
#define NPI_HALF 0x40000000
#define SINE_THR_N_1 85669753 // 0.125327831168065
#define SINE_THR_N_2 172723448 // 0.252680255142079
#define SINE_THR_N_3 262760287 // 0.384396774495639
#define SINE_THR_N_4 357913941 // 0.523598775598299
#define SINE_THR_N_5 461496472 // 0.675131532937032
#define SINE_THR_N_6 579705789 // 0.848062078981481
#define SINE_THR_N_7 728294928 // 1.06543581651074
#define SINE_THR_N_8 NPI_HALF // 1.5707963267949
#define COSINE_THR_N_1 345446896 // 0.505360510284157
#define COSINE_THR_N_2 494036035 // 0.722734247813416
#define COSINE_THR_N_3 612245352 // 0.895664793857865
#define COSINE_THR_N_4 715827883 // 1.0471975511966
#define COSINE_THR_N_5 810981537 // 1.18639955229926
#define COSINE_THR_N_6 901018376 // 1.31811607165282
#define COSINE_THR_N_7 988072071 // 1.44546849562683
#define COSINE_THR_N_8 NPI_HALF // 1.5707963267949
#else
#define NPI 0x8000000000000000
#define NPI_HALF 0x4000000000000000
#define SINE_THR_N_1 0x051B37797325C5C0 // 0.125327831168065 0.079786175349536
#define SINE_THR_N_2 0x0A4B8CF83D29BB80 // 0.252680255142079 0.160861246510332
#define SINE_THR_N_3 0x0FA9675F16BBCA80 // 0.384396774495639 0.244714587078246
#define SINE_THR_N_4 0x1555555555555600 // 0.523598775598299 0.333333333333333
#define SINE_THR_N_5 0x1B81E0985CC8EB00 // 0.675131532937032 0.429802082816549
#define SINE_THR_N_6 0x228D9BBCB992E400 // 0.848062078981481 0.539893087674768
#define SINE_THR_N_7 0x2B68E60F85AC8A00 // 1.065435816510739 0.678277506979335
#define SINE_THR_N_8 NPI_HALF // 1.570796326794897 1.000000000000000
#define COSINE_THR_N_1 0x149719F07A537700 // 0.505360510284157 0.321722493020665
#define COSINE_THR_N_2 0x1D726443466D1E00 // 0.722734247813416 0.460106912325232
#define COSINE_THR_N_3 0x247E1F67A3371600 // 0.895664793857865 0.570197917183451
#define COSINE_THR_N_4 0x2AAAAAAAAAAAAC00 // 1.047197551196598 0.666666666666667
#define COSINE_THR_N_5 0x305698A0E9443800 // 1.186399552299258 0.755285412921754
#define COSINE_THR_N_6 0x35B47307C2D64600 // 1.318116071652818 0.839138753489668
#define COSINE_THR_N_7 0x3AE4C8868CDA3C00 // 1.445468495626831 0.920213824650464
#define COSINE_THR_N_8 NPI_HALF // 1.570796326794897 1.000000000000000
#endif
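The threshold tables above store angles as fixed-point phase values in which one full turn (2*pi) covers the whole register range. The sketch below shows that mapping for the 32-bit case using integer arithmetic only; the helper `phase_from_degrees` is hypothetical and exists only to illustrate the scaling.

```c
#include <stdint.h>

/* Hypothetical helper: map whole degrees to a 32-bit phase value,
 * where one full turn (360 degrees = 2*pi) corresponds to 2^32.
 * The result is truncated, not rounded. */
uint32_t phase_from_degrees(unsigned deg) {
    uint64_t turn_fraction = ((uint64_t)(deg % 360u)) << 32;
    return (uint32_t)(turn_fraction / 360u);
}
```

With this scaling, 90 degrees gives 0x40000000 (`NPI_HALF` in the RV32 table), 180 degrees gives 0x80000000 (`NPI`), and 30 degrees gives 357913941, which is the RV32 value of `SINE_THR_N_4` (pi/6, the angle whose sine is 0.5).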
// Compute sine(arg) for <0;2*pi> mapped to the full REGTYPE range (<0;0xffffffff> on RV32)
REGTYPE sineQuantNorm(unsigned bits, REGTYPE arg) {
unsigned pos, val = 0; // val stays 0 when bits is outside 1..4
pos=1;
// printf("arg1 %f 2*PI %f\n",arg, 2*PI);
// printf("arg2 %f\n",arg);
if (arg>NPI) {
// arg=arg-NPI;
arg&=(~NPI);
pos=0;
}
if (arg>(NPI_HALF)) {
// arg=(NPI)-arg;
arg=((~arg)+1)&(~NPI);
}
// printf("arg3 %f\n",arg);
if (bits==1) {
val=pos;
}
if (bits==2) {
if (arg<SINE_THR_N_4) val=2;
else val=3;
if (pos==0) val=3-val;
}
if (bits==3) {
if (arg<SINE_THR_N_2) val=4;
else if (arg<SINE_THR_N_4) val=5;
else if (arg<SINE_THR_N_6) val=6;
else val=7;
if (pos==0) val=7-val;
}
if (bits==4) {
if (arg<SINE_THR_N_1) val=8;
else if (arg<SINE_THR_N_2) val=9;
else if (arg<SINE_THR_N_3) val=10;
else if (arg<SINE_THR_N_4) val=11;
else if (arg<SINE_THR_N_5) val=12;
else if (arg<SINE_THR_N_6) val=13;
else if (arg<SINE_THR_N_7) val=14;
else val=15;
if (pos==0) val=15-val;
}
return val;
}
// Compute cosine(arg) for <0;2*pi> mapped to the full REGTYPE range (<0;0xffffffff> on RV32)
REGTYPE cosineQuantNorm(unsigned bits, REGTYPE arg) {
unsigned pos, val = 0; // val stays 0 when bits is outside 1..4
pos=1;
// printf("arg1 %f 2*PI %f\n",arg, 2*PI);
// printf("arg2 %f\n",arg);
if (arg>NPI) {
// arg=2*NPI-arg;
arg=(~arg)+1;
}
if (arg>(NPI_HALF)) {
// arg=(NPI)-arg;
arg=((~arg)+1)&(~NPI);
pos=0;
}
// printf("arg3 %f\n",arg);
if (bits==1) {
val=pos;
}
if (bits==2) {
if (arg<COSINE_THR_N_4) val=3;
else val=2;
if (pos==0) val=3-val;
}
if (bits==3) {
if (arg<COSINE_THR_N_2) val=7;
else if (arg<COSINE_THR_N_4) val=6;
else if (arg<COSINE_THR_N_6) val=5;
else val=4;
if (pos==0) val=7-val;
}
if (bits==4) {
if (arg<COSINE_THR_N_1) val=15;
else if (arg<COSINE_THR_N_2) val=14;
else if (arg<COSINE_THR_N_3) val=13;
else if (arg<COSINE_THR_N_4) val=12;
else if (arg<COSINE_THR_N_5) val=11;
else if (arg<COSINE_THR_N_6) val=10;
else if (arg<COSINE_THR_N_7) val=9;
else val=8;
if (pos==0) val=15-val;
}
return val;
}
REGTYPE swar_sincos(REGTYPE coef, unsigned bits) {
REGTYPE sin1, cos1;
REGTYPE res = 0; /* defined result when the bit count is invalid */
cos1 = (cosineQuantNorm(bits, coef));
sin1 = (sineQuantNorm(bits, coef));
#ifdef DBGPRINT
printf("%08lX c1 %1lX s1 %1lX\n", coef, cos1, sin1);
#endif /* DBGPRINT */
switch (bits) {
case 1:
res = (cos1 << 1) | sin1;
break;
case 2:
res = (cos1 << 2) | sin1;
break;
case 3:
res = (cos1 << 3) | sin1;
break;
case 4:
res = (cos1 << 4) | sin1;
break;
default:
printf("SWAR_SINCOS: INVALID BIT COUNT\n");
break;
}
#ifdef DBGPRINT
printf("sincos: %016lX\n", res);
#endif /* DBGPRINT */
return res;
}
REGTYPE swar_alu(REGTYPE veca, REGTYPE vecb, unsigned packingFactor, unsigned bits, unsigned oper, unsigned sgnd, unsigned sat, unsigned reduce, SREGTYPE *acc) {
unsigned i;
REGTYPE mask;
REGTYPE a, b;
REGTYPE c, res;
unsigned neg;
mask = ((REGTYPE)1 << bits) - 1; /* shift in REGTYPE so wide lanes do not overflow int */
c=0;
res=0;
neg=0;
#ifdef DBGPRINT
printf("veca %016lX vecb %016lX packing %d bits %d oper %d sgnd %d sat %d red %d\n", veca, vecb, packingFactor, bits, oper, sgnd, sat, reduce);
#endif /* DBGPRINT */
for (i=0;i<packingFactor;i++) {
a = ((veca) >> bits*i) & mask;
b = ((vecb) >> bits*i) & mask;
if (sgnd) {
if (a & (1<<(bits-1))) a |= ~(mask);
if (b & (1<<(bits-1))) b |= ~(mask);
}
switch (oper) {
case ADD: c = a + b; break;
case SUB: c = a - b; neg=(a<b)?1:0; break;
case MUL: c = a * b; break;
default: printf("SWAR_ALU: INVALID OPERATION\n"); break;
}
if (sat==SATUR) {
if (!sgnd) {
if (neg) c=0;
else {
if (c>((1<<bits)-1)) c = (1<<bits)-1;
}
}
else {
if ((SREGTYPE)c>(SREGTYPE)((1<<(bits-1))-1)) c = (1<<(bits-1))-1;
else if ((SREGTYPE)c<(SREGTYPE)(-(1<<(bits-1)))) c = (-(1<<(bits-1)));
}
}
// else {
// if ((!sgnd) && (neg)) c = c | (~mask);
// }
#ifdef DBGPRINT
printf("i %d sgnd %d sat %d a %016lX b %016lX c %016lX neg %d\n", i, sgnd, sat, a, b, c, neg);
#endif
if (reduce) res += c;
else res |= (c & mask) << (bits*i);
// if ((!sgnd) && (bits<16)) c &= (1 << (bits<<1)) - 1;
*(acc+i) += c;
#ifdef DBGPRINT
printf("c %016lX acc[%d] = %016lX\n", c, i, *(acc+i));
printf("%5d a %3ld b %3ld c %3ld res %016lX\n",i, a, b, c, res); // acc(%08X) , acc[i]
#endif /* DBGPRINT */
}
return res;
}
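The lane-wise arithmetic modelled by `swar_alu` can be hard to follow through the sign, saturation and accumulator handling. The following is a reduced sketch of only the unsigned packed addition, for a 64-bit word and lane widths of 1 to 32 bits; the function `packed_add_u` is hypothetical and deliberately omits the signed, reduce and accumulator paths of the model above.

```c
#include <stdint.h>

/* Hypothetical helper: add two words lane by lane.
 * bits     - lane width, must be in 1..32
 * saturate - nonzero clamps each lane to its maximum instead of wrapping */
uint64_t packed_add_u(uint64_t a, uint64_t b, unsigned bits, int saturate) {
    const uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t res = 0;

    /* Walk over every complete lane that fits into 64 bits. */
    for (unsigned shift = 0; shift + bits <= 64; shift += bits) {
        uint64_t la = (a >> shift) & mask;
        uint64_t lb = (b >> shift) & mask;
        uint64_t sum = la + lb;          /* cannot overflow: bits <= 32 */

        if (saturate && sum > mask)
            sum = mask;                  /* clamp to the lane maximum */

        res |= (sum & mask) << shift;    /* wrap (if not clamped) and store */
    }
    return res;
}
```

For example, with 8-bit lanes the inputs 0xF0F0 and 0x2010 overflow in both low lanes, so the saturating result is 0xFFFF, while the wrapping result keeps only the low 8 bits of each lane sum.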
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : ise.h
* Authors : Martin Danek
* Description : models of the ISE for LEON2
* Release :
* Version : 1.0
* Date : 20.04.2019
* -----------------------------------------------------------------------------
*/
#ifndef ISE_H
#define ISE_H
#include <stdint.h> /* uint32_t, uint64_t, int32_t, int64_t used below */
/* default architecture is 64bit */
#ifdef RV32
typedef uint32_t REGTYPE;
typedef int32_t SREGTYPE;
#define REGBITS 32
#define REGFMT "0x%08X"
#else
typedef uint64_t REGTYPE;
typedef int64_t SREGTYPE;
#define REGBITS 64
#define REGFMT "0x%016lX"
#endif
#define ADD 0x0
#define SUB 0x8
#define MUL 0xC
#define NOSAT 0
#define SATUR 1
REGTYPE swar_dem(REGTYPE exp, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned interleaved, unsigned cplx);
SREGTYPE swar_corr(REGTYPE code, REGTYPE signal, unsigned packingFactor, unsigned bits, unsigned sgnd, unsigned reduce, SREGTYPE *sop);
REGTYPE swar_sincos(REGTYPE coef, unsigned bits);
REGTYPE swar_alu(REGTYPE veca, REGTYPE vecb, unsigned packingFactor, unsigned bits, unsigned oper, unsigned sgnd, unsigned sat, unsigned reduce, SREGTYPE *acc);
#endif /* ISE_H */
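The correlation declared above as `swar_corr` multiplies one code bit per lane with a packed signal sample and sums the products. The sketch below restates that loop in a self-contained form; `corr_sketch` is a hypothetical name, it drops the accumulator argument, and it assumes that `lanes * bits` does not exceed 64 and that `bits` is in 1..32.

```c
#include <stdint.h>

/* Hypothetical helper: correlate a 1-bit-per-lane code with packed samples.
 * code   - bit i is the code value for lane i
 * signal - lane i holds an unsigned sample of `bits` bits
 * sgnd   - nonzero maps code bit 0 to -1 (bipolar code), else 0 stays 0 */
int64_t corr_sketch(uint64_t code, uint64_t signal,
                    unsigned lanes, unsigned bits, int sgnd) {
    const uint64_t mask = (UINT64_C(1) << bits) - 1;
    int64_t sum = 0;

    for (unsigned i = 0; i < lanes; i++) {
        int64_t c = (int64_t)((code >> i) & 1u);
        if (sgnd && c == 0)
            c = -1;
        int64_t s = (int64_t)((signal >> (bits * i)) & mask);
        sum += c * s;
    }
    return sum;
}
```

Called with 16 lanes of 2 bits in signed mode, the first triplet of the `corr` table (code 0x7, signal 0x0C000000) yields -3, which agrees with the stored reference 0xFFFFFFFD if that reference is read as a 32-bit two's complement value; that reading is an assumption.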
/* -----------------------------------------------------------------------------
* Copyright (C) 2019 daiteq s.r.o. http://www.daiteq.com
*
* This program is distributed WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
*
* -----------------------------------------------------------------------------
* Filename : swar.h
* Authors : Martin Danek
* Description : configuration of SWAR instructions
* Release :
* Version : 1.0
* Date : 9.4.2019
* -----------------------------------------------------------------------------
*/
#ifndef SWAR_HEADER_FILE
#define SWAR_HEADER_FILE
/* SWAR configuration register */
#define SWAR_CONF_ACCSIZE_MASK 0x3F
#define SWAR_CONF_ACCSIZE_SHIFT 26
#define SWAR_CONF_SWIDTH_MASK 0x1F
#define SWAR_CONF_SWIDTH_SHIFT 21
#define SWAR_CONF_LANES_MASK 0x1F
#define SWAR_CONF_LANES_SHIFT 16
#define SWAR_CONF_FCN_CORREL 0
#define SWAR_CONF_FCN_DEMOD 1
#define SWAR_CONF_FCN_SINCOS 2
#define SWAR_CONF_FCN_AUDIO 3
#define SWAR_CONF_FCN_VIDEO 4
#define SWAR_CONF_FCN_ALU 5
#define SWAR_CONF_FCN_LACCUMS 6
/* SWAR control register */
/* audio/video/alu operations */
#define SW_OP_ADD 0x00
#define SW_OP_SUB 0x08
#define SW_OP_MUL 0x0C
/* correlation */
#define SW_OP_COR1b 0x04
#define SW_OP_COR2b 0x05
#define SW_OP_COR3b 0x06
#define SW_OP_COR4b 0x07
/* demodulation */
#define SW_OP_DEMR2b 0x09
#define SW_OP_DEMR3b 0x0A
#define SW_OP_DEMR4b 0x0B
#define SW_OP_DEMC2b 0x0D
#define SW_OP_DEMC3b 0x0E
#define SW_OP_DEMC4b 0x0F
#define SW_OP_DEMC2bG 0x01
#define SW_OP_DEMC3bG 0x02
#define SW_OP_DEMC4bG 0x03
/* sincos LUT */
#define SW_OP_SC1b 0x10
#define SW_OP_SC2b 0x20
#define SW_OP_SC3b 0x30
#define SW_OP_SC4b 0x40
#define SW_CTRL_OPMASK 0xFF
#define SW_CTRL_SIGNED (1<<8)
#define SW_CTRL_REDUCE (1<<9)
#define SW_CTRL_SATURATE (1<<10)
#define SW_CTRL_NORMALIZE (1<<11)
#define SW_CTRL_AUDIO (1<<12)
#define SW_CTRL_VIDEO (1<<13)
#define SW_CTRL_ALU (1<<14)
#endif /* SWAR_HEADER_FILE */
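The mask and shift pairs in `swar.h` describe bit fields of the configuration register, and the `SW_OP_*` and `SW_CTRL_*` values are combined into one control word. The sketch below shows one way to use them; the helper names are hypothetical, the macros are repeated so the block compiles on its own, and the meaning of each field (accumulator size, sample width, lane count) is inferred from the macro names only.

```c
#include <stdint.h>

/* Repeated from swar.h so this block is self-contained. */
#define SWAR_CONF_ACCSIZE_MASK 0x3F
#define SWAR_CONF_ACCSIZE_SHIFT 26
#define SWAR_CONF_SWIDTH_MASK 0x1F
#define SWAR_CONF_SWIDTH_SHIFT 21
#define SWAR_CONF_LANES_MASK 0x1F
#define SWAR_CONF_LANES_SHIFT 16
#define SW_CTRL_OPMASK 0xFF

/* Hypothetical helpers: extract one field from a configuration word. */
unsigned swar_conf_accsize(uint32_t conf) {
    return (conf >> SWAR_CONF_ACCSIZE_SHIFT) & SWAR_CONF_ACCSIZE_MASK;
}

unsigned swar_conf_swidth(uint32_t conf) {
    return (conf >> SWAR_CONF_SWIDTH_SHIFT) & SWAR_CONF_SWIDTH_MASK;
}

unsigned swar_conf_lanes(uint32_t conf) {
    return (conf >> SWAR_CONF_LANES_SHIFT) & SWAR_CONF_LANES_MASK;
}

/* Hypothetical helper: combine an operation code (low 8 bits)
 * with control flags (bits 8 and above) into one control word. */
uint32_t swar_make_ctrl(uint32_t op, uint32_t flags) {
    return (op & SW_CTRL_OPMASK) | (flags & ~(uint32_t)SW_CTRL_OPMASK);
}
```

For instance, `swar_make_ctrl(SW_OP_MUL, SW_CTRL_SIGNED | SW_CTRL_SATURATE)` would give 0x50C, which is the kind of value the test tables pass to `op_set_ctrl`.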
#define ITYPE_RAW uint64_t
#define STYPE_RAW int64_t
// SWAR operations for data packed in 64bits
typedef struct test_64b_item {
unsigned swop; /* swar operation code */
unsigned dt;
ITYPE_RAW opa;
ITYPE_RAW opb;
ITYPE_RAW reference;
} test_64b_item_t;
const ITYPE_RAW corr[] = {
0x0000000000000007, 0x000000000C000000, 0x00000000FFFFFFFD,
0x000000000000FFFF, 0x0000000030000000, 0x0000000000000003,
0x000000000000FFFF, 0x0000000000F00000, 0x0000000000000006,
0x000000000000FFFC, 0x00000000CCC00000, 0x0000000000000009,
0x0000000000000000, 0x0000000000000000, 0x0000000000000000,
0x0000000000000000, 0x000000000333CFF0, 0x00000000FFFFFFE8,
0x0000000000000003, 0x0000000000000000, 0x0000000000000000,
0x000000000000FFFF, 0x0000000030000000, 0x0000000000000003,
0x000000000000FE00, 0x000000000300C000, 0x0000000000000000,
0x0000000000000001, 0x00000000000000CC, 0x00000000FFFFFFFA,
0x000000000000FFFF, 0x000000003000C00C, 0x0000000000000009,
0x000000000000FF00, 0x000000000CC0CCF0, 0x00000000FFFFFFFA,
0x0000000000000000, 0x00000000033C0300, 0x00000000FFFFFFF4,
0x000000000000FFFF, 0x00000000C0000030, 0x0000000000000006,
0x000000000000FFFF, 0x00000000F0000033, 0x000000000000000C,
0x000000000000FFFF, 0x000000000003C000, 0x0000000000000006,
0x0000000000008000, 0x0000000000C03003, 0x00000000FFFFFFF7,
0x000000000000003F, 0x0000000000003000, 0x00000000FFFFFFFD,
0x000000000000FFFF, 0x0000000000000000, 0x0000000000000000,
0x000000000000C000, 0x00000000300C0F3C, 0x00000000FFFFFFF4,
0x000000000000003F, 0x000000000000C000, 0x00000000FFFFFFFD,
0x000000000000FFFF, 0x00000000003F03C3, 0x0000000000000012,
0x000000000000FFFF, 0x0000000003030000, 0x0000000000000006,
0x000000000000FFFF, 0x00000000030C0FF0, 0x0000000000000012,
0x000000000000FFFF, 0x0000000033CC30CC, 0x0000000000000015,
0x000000000000FFFF, 0x0000000000033C00, 0x0000000000000009,
0x000000000000FFF0, 0x000000003030F000, 0x000000000000000C,
0x0000000000000000, 0x000000000030030F, 0x00000000FFFFFFF4,
0x00000000000007FF, 0x000000000C000C00, 0x0000000000000000,
0x000000000000FFFF, 0x000000000303000C, 0x0000000000000009,
0x000000000000FFFF, 0x00000000003C0000, 0x0000000000000006,
0x000000000000F800, 0x000000000300C30C, 0x00000000FFFFFFFA,
0x0000000000000000, 0x00000000F00F0000, 0x00000000FFFFFFF4,
0x0000000000000000, 0x0000000000033003, 0x00000000FFFFFFF7,
0x0000000000000000, 0x0000000000000000, 0x0000000000000000,
0x0000000000000001, 0x0000000000000000, 0x0000000000000000,
0x000000000000FFFF, 0x000000003CCC0000, 0x000000000000000C,
0x000000000000FE00, 0x0000000003C00C00, 0x0000000000000003,
0x0000000000000000, 0x000000000000C000, 0x00000000FFFFFFFD,
0x0000000000000000, 0x0000000003CC0F00, 0x00000000FFFFFFF1,
0x0000000000000000, 0x00000000C0000000, 0x00000000FFFFFFFD,
0x0000000000000000, 0x000000000FCCC000, 0x00000000FFFFFFF1,
0x0000000000007FFF, 0x00000000C0C00000, 0x0000000000000000,
0x000000000000FFFF, 0x0000000003000000, 0x0000000000000003,
0x000000000000FFFF, 0x000000000C000003, 0x0000000000000006,
0x000000000000FFFF, 0x000000000C00F300, 0x000000000000000C,
0x000000000000FFC0, 0x00000000000000F0, 0x00000000FFFFFFFA,
0x0000000000000000, 0x000000000C00C00C, 0x00000000FFFFFFF7,
0x0000000000003FFF, 0x00000000030CFC00, 0x000000000000000F,
0x000000000000FFE0, 0x00000000000000CC, 0x00000000FFFFFFFA,
0x0000000000000000, 0x00000000C3000C00, 0x00000000FFFFFFF7,
0x0000000000001FFF, 0x00000000000C0CC0, 0x0000000000000009,
0x000000000000FFF0, 0x0000000000000030, 0x00000000FFFFFFFD,
0x0000000000000000, 0x000000000C3300C0, 0x00000000FFFFFFF4,
0x0000000000000FFF, 0x000000000000000C, 0x0000000000000003,
0x000000000000FFF0, 0x00000000C300300C, 0x0000000000000006,
0x0000000000000000, 0x00000000000000FF, 0x00000000FFFFFFF4,
0x0000000000000000, 0x00000000FC00003C, 0x00000000FFFFFFF1
};
const ITYPE_RAW demod[] = {
0x00000000EF7208DF, 0x00000000AEE6AAAA, 0x000000000C000000,
0x000000007208DEB3, 0x00000000AAAEEAEA, 0x0000000000000000,
0x0000000014CEB314, 0x000000009BAAABAA, 0x0000000030003030,
0x00000000CEF7208D, 0x00000000AAAABA6A, 0x00000000000000C0,
0x00000000F7208DFB, 0x00000000EBAABBAA, 0x0000000000F03000,
0x00000000314CEB31, 0x00000000AAAAAA6A, 0x0000000000000000,
0x000000004CEF7208, 0x00000000B96AFA6A, 0x00000000CCC0F0C0,
0x00000000DF7208DF, 0x00000000EAAB9ADA, 0x0000000000000300,
0x00000000B314CEB3, 0x00000000AAA6AEEB, 0x0000000000000003,
0x0000000014CEB720, 0x0000000066AEAAAA, 0x0000000000000000,
0x000000008DF7208D, 0x0000000069A9ABA6, 0x0000000003333030,
0x00000000F7314CEB, 0x000000009ADBBBAA, 0x00000000CFF0C3C0,
0x00000000314CEB72, 0x00000000AAA9AA6A, 0x0000000000000300,
0x0000000008DF7208, 0x00000000AAAAAA6A, 0x0000000000000000,
0x00000000DF7314CE, 0x00000000A9A6AAEA, 0x0000000030003000,
0x00000000B314CEB3, 0x00000000BB6AAAAB, 0x0000000000003003,
0x00000000208DF720, 0x00000000BAABEAA6, 0x000000000300C000,
0x000000008DF7314C, 0x00000000BAAAAAAE, 0x00000000C0000000,
0x00000000EB314CEB, 0x00000000BBA9AAFA, 0x0000000000000000,
0x000000003208DF72, 0x000000006AAA7BD6, 0x0000000000CC0000,
0x0000000008DF7214, 0x00000000ABAA65AA, 0x0000000030000000,
0x00000000CEB314CE, 0x00000000BFAAAABA, 0x00000000C00C0000,
0x00000000B3108DF7, 0x00000000AAEABAAA, 0x000000000CC00000,
0x00000000208DF721, 0x00000000EABA99AA, 0x00000000CCF000C0,
0x000000004CEB314C, 0x00000000AAE96EEA, 0x00000000033C0000,
0x00000000EB3108DF, 0x00000000A6AEA9AA, 0x0000000003000000,
0x000000007208DF72, 0x000000005A9EAAAA, 0x00000000C0000000,
0x0000000004CEB314, 0x0000000069EBB9AA, 0x0000000000300000,
0x00000000CEB3108D, 0x00000000F6AAAAAA, 0x00000000F0000000,
0x00000000F7208DF7, 0x00000000AA6696AF, 0x0000000000330003,
0x00000000204CEB31, 0x00000000A69AAAAE, 0x0000000000030000,
0x000000004CEB3148, 0x00000000BEAAAAAA, 0x00000000C000C000,
0x00000000DF7208DF, 0x00000000AAA6EAAE, 0x0000000000C00000,
0x000000007208CEB3, 0x00000000AE9AAEA9, 0x0000000030030000,
0x0000000014CEB314, 0x00000000A9A9AA6A, 0x0000000000000300,
0x000000008DF7208D, 0x00000000EBAA5AAE, 0x0000000030000000,
0x00000000F7208CEB, 0x00000000AAAAA9AA, 0x0000000000000030,
0x00000000314CEB31, 0x00000000AA5AAEA6, 0x0000000000000000,
0x000000004CDF7208, 0x00000000A5AEA776, 0x00000000300C303C,
0x00000000DF7208CE, 0x00000000AAEEABBA, 0x000000000F3C0000,
0x00000000B314CEB3, 0x00000000AA9AAAAA, 0x0000000000000000,
0x0000000014CDF720, 0x00000000E6E9AAB6, 0x00000000C000030C,
0x000000008DF7208D, 0x00000000AAAABEB7, 0x00000000003F00C0,
0x00000000EB314CEB, 0x00000000AAAEBAA9, 0x0000000003C300C0,
0x00000000314CEF72, 0x00000000AAA6AAAE, 0x0000000003030000,
0x0000000008DF7208, 0x00000000A6AEA9AE, 0x0000000000000000,
0x00000000DEB314CE, 0x00000000AAAE9AFA, 0x00000000030C0000,
0x00000000B314CEF7, 0x000000006AEFF6AA, 0x000000000FF00000,
0x00000000208DF720, 0x00000000AEAB66DA, 0x0000000033CC0000,
0x000000008DFB314C, 0x00000000ABBAEAB9, 0x0000000030CC000F,
0x00000000EB314CEF, 0x00000000AABAAEA9, 0x0000000000030C03,
0x000000007208DF72, 0x00000000AEEAEAAA, 0x000000003C000000,
0x0000000008DFB314, 0x00000000ABAB6EAA, 0x0000000030300000,
0x00000000CEB314CE, 0x00000000B6EAAAAA, 0x00000000F0000000,
0x00000000B7208DF7, 0x00000000ABBAA7A6, 0x0000000000303C00,
0x00000000208DF731, 0x00000000AAE5BA9E, 0x00000000030F0300,
0x000000004CEB314C, 0x00000000AA6AA99A, 0x000000000C000000,
0x00000000EB7208DF, 0x00000000AEEAAAAA, 0x000000000C000000,
0x000000007208DF73, 0x00000000AAABEEA9, 0x0000000003030000
};
const struct {
const ITYPE_RAW *pdata;
unsigned num;
} indirdata[] = {
{corr, sizeof(corr)/(sizeof(ITYPE_RAW)*3)},
{demod, sizeof(demod)/(sizeof(ITYPE_RAW)*3)},
};
enum indirdata_indices {
INDIR_CORR = 0,
INDIR_DEMOD = 1,
};
const test_64b_item_t testdata[] = {
/* -------------------------------------------------------------------------- */
/* 1p32 */
// {OP | CTRL, DT, A, B, REF},
// {SW_OP_ADD | SW_CTRL_ALU, DT_1P32, 0x11111111, 0x00000000, 0x11111111},
// {SW_OP_ADD | SW_CTRL_ALU, DT_1P32, 0x10101010, 0x01010101, 0x11111111},
// {SW_OP_MUL | SW_CTRL_ALU | SW_CTRL_SATURATE, DT_1P32, 0x10101010, 0x01010101, 0x00000000},
#ifdef SWAR_CORR
{SW_OP_COR2b | SW_CTRL_SIGNED, DT_2P64 | DT_ARRAY, INDIR_CORR, 0, 0}, /* the 1st argument contains an index to indirdata (0=corr, 1=demod) */
{SW_OP_COR1b, DT_1P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_COR3b | SW_CTRL_REDUCE, DT_3P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_COR4b | SW_CTRL_SIGNED | SW_CTRL_REDUCE, DT_4P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_DEMC2bG, DT_2P64 | DT_ARRAY, INDIR_DEMOD, 0, 0},
{SW_OP_DEMC2b, DT_2P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
{SW_OP_DEMC4b, DT_4P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
{SW_OP_DEMR3b, DT_3P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
#endif /* SWAR_CORR */
#ifdef SWAR_SINCOS
{SW_OP_SC4b, DT_4P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
{SW_OP_SC3b, DT_3P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
{SW_OP_SC2b, DT_2P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
{SW_OP_SC1b, DT_1P64 | DT_ARRAY | DT_SWREF, INDIR_DEMOD, 0, 0},
#endif /* SWAR_SINCOS */
#ifdef SWAR_ALU
{SW_OP_ADD | SW_CTRL_VIDEO, DT_8P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_SUB | SW_CTRL_VIDEO, DT_8P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_MUL | SW_CTRL_VIDEO, DT_8P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_ADD | SW_CTRL_AUDIO, DT_16P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_SUB | SW_CTRL_AUDIO, DT_16P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_MUL | SW_CTRL_AUDIO, DT_16P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
/*
{SW_OP_ADD | SW_CTRL_ALU, DT_2P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_SUB | SW_CTRL_ALU, DT_2P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
{SW_OP_MUL | SW_CTRL_ALU, DT_2P64 | DT_ARRAY | DT_SWREF, INDIR_CORR, 0, 0},
*/
#endif /* SWAR_ALU */
// get value from accumulator, 1st argument contains an index of accumulator register
{ACCGET, 0, 1, 0, 0},
};
#define SIZETESTDATA (sizeof(testdata)/sizeof(test_64b_item_t))
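The `indirdata` entries above divide each array size by three, which suggests that `corr` and `demod` are flat lists of triplets. The sketch below walks such a list, assuming the order within a triplet is first operand, second operand, reference value; that order, the type name `item_raw_t` and the accessor functions are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t item_raw_t;   /* stands in for ITYPE_RAW */

/* First two triplets of the corr table, copied from the listing above. */
static const item_raw_t sample[] = {
    0x0000000000000007, 0x000000000C000000, 0x00000000FFFFFFFD,
    0x000000000000FFFF, 0x0000000030000000, 0x0000000000000003,
};

/* Number of complete triplets, computed the same way as in indirdata. */
size_t sample_count(void) {
    return sizeof(sample) / (sizeof(item_raw_t) * 3);
}

/* Hypothetical accessors for the three members of triplet idx. */
item_raw_t sample_opa(size_t idx) { return sample[3 * idx + 0]; }
item_raw_t sample_opb(size_t idx) { return sample[3 * idx + 1]; }
item_raw_t sample_ref(size_t idx) { return sample[3 * idx + 2]; }
```

A test loop would then iterate `idx` from 0 to `sample_count() - 1`, run the selected SWAR operation on `sample_opa(idx)` and `sample_opb(idx)`, and compare the result with `sample_ref(idx)`.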
# Makefile for Whetstone project
BIN=wstone
# C sources
SOURCES=whets.c
HEADERS=../common_sys_header.inc
CFLAGS=-ggdb
# libraries in the required order (they are built in this order)
LIBS=c m c bcc built
# Makefile for building BCC library
#include $(TOP)/Makefile.tools
# library specific compilation flags
CFLAGS=-O2 -ffunction-sections -fdata-sections $(LIBFLAGS)
# ------------------------------------------------------------------------------
# libbcc: from bsp
BCC_SRCPATH:=$(abspath ./bsp)
BSP_SRCPATH:=$(BCC_SRCPATH)/bsp/2020q4
# libc include
LIBC_SRCPATH:=$(abspath ./newlib/newlib/libc)
# read common list of source files
-include $(BCC_SRCPATH)/common.mk
EXTRA_S_SRCS:=shared/crt0.S shared/first.S
BCC_SOURCES:=$(BCC_COMMON_SOURCES) $(BCC_INT_IRQMP_SOURCES) $(BCC_APBUART_SOURCES) $(BCC_GPTIMER_SOURCES)
BCC_OBJS:=$(subst /,~,$(BCC_SOURCES:%=%.o))
BCC_C_SOURCES = $(filter %.c,$(BCC_SOURCES))
BCC_S_SOURCES = $(filter %.S,$(BCC_SOURCES))
BCC_C_OBJS=$(BCC_C_SOURCES:.c=.o)
BCC_S_OBJS=$(BCC_S_SOURCES:.S=.o)
BCC_C_OBJS_SUB=$(subst /,~,$(BCC_C_OBJS))
BCC_S_OBJS_SUB=$(subst /,~,$(BCC_S_OBJS))
BCC_EXTRA_OBJS=$(notdir $(EXTRA_S_SRCS:.S=.o))
BCC_C_OBJS_FULLPATH=$(BCC_C_OBJS_SUB:%=$(DSTDIR)/%)
BCC_S_OBJS_FULLPATH=$(BCC_S_OBJS_SUB:%=$(DSTDIR)/%)
BCC_EXTRA_OBJS_FULLPATH=$(addprefix $(DSTDIR)/,$(BCC_EXTRA_OBJS))
#$(info "TEST")
#$(info $(EXTRA_S_SRCS))
#$(info $(BCC_EXTRA_OBJS))
#$(info $(BCC_S_OBJS_FULLPATH))
#$(info $(BCC_EXTRA_OBJS_FULLPATH))
#$(info "END-TEST")
all: $(DSTDIR)/../include $(BCC_EXTRA_OBJS_FULLPATH) $(DSTDIR)/libbcc.a
@echo "BCC library is prepared"
# run building
$(DSTDIR)/libbcc.a: $(DSTDIR) $(BCC_SRCPATH)/common.mk $(BCC_C_OBJS_FULLPATH) $(BCC_S_OBJS_FULLPATH)
@$(AR) r $@ $(BCC_C_OBJS_FULLPATH) $(BCC_S_OBJS_FULLPATH)
# run copying include
$(DSTDIR)/../include:
@echo "Copy include"
@cp -r $(BCC_SRCPATH)/shared/include $@
@cp -r $(BSP_SRCPATH)/include/* $@
define GEN_C2O_RULE
$(DSTDIR)/$(ofile): $(DSTDIR)/$(ofile:%.o=%.c_S)
@echo "Assemble $$@ from $$<"
@$(AS) $(ASARCH) -gdwarf-5 -o $$@ $$<
$(DSTDIR)/$(ofile:%.o=%.c_S) : $(BCC_SRCPATH)/$(subst ~,/,$(notdir $(ofile:.o=.c))) $(DSTDIR)/../include $(LIBC_SRCPATH)/include
@echo "Compile '$$<' to '$$@'"
@$(CC) $(CCARCH) $(CFLAGS) -fno-addrsig -Wall -Wextra -pedantic -fno-builtin -I$(DSTDIR)/../include -I$(LIBC_SRCPATH)/include -S -o $$@ $$<
endef
$(foreach ofile, $(BCC_C_OBJS_SUB), \
$(eval $(GEN_C2O_RULE)) \
)
define GEN_S2O_RULE
$(DSTDIR)/$(ofile) : $(BCC_SRCPATH)/$(subst ~,/,$(notdir $(ofile:.o=.S))) $(DSTDIR)/../include
@$(CC) $(CCARCH) $(CFLAGS) -fno-addrsig -I$(DSTDIR)/../include -I$(BCC_SRCPATH)/shared/inc -S -o $$@_as $$<
@$(AS) $(ASARCH) -gdwarf-5 -I$(BCC_SRCPATH)/shared/inc -o $$@ $$@_as
endef
$(foreach ofile, $(BCC_S_OBJS_SUB), \
$(eval $(GEN_S2O_RULE)) \
)
define GEN_EXT_RULE
$(DSTDIR)/$(notdir $(oext:.S=.o)) : $(BCC_SRCPATH)/$(oext)
@$(CC) $(CCARCH) $(CFLAGS) -fno-addrsig -I$(DSTDIR)/../include -I$(BCC_SRCPATH)/shared/inc -S -o $$@_as $$<
@$(AS) $(ASARCH) -gdwarf-5 -I$(BCC_SRCPATH)/shared/inc -o $$@ $$@_as
endef
$(foreach oext, $(EXTRA_S_SRCS), \
$(eval $(GEN_EXT_RULE)) \
)
# ------------------------------------------------------------------------------
clean:
rm -f `find $(DSTDIR) -name "*.o"`
rm -f `find $(DSTDIR) -name "*.a"`
# Makefile for building 'built' library (built-in llvm functions)
#include $(TOP)/Makefile.tools
# library specific compilation flags
CFLAGS=-ffunction-sections -fdata-sections $(LIBFLAGS)
# ------------------------------------------------------------------------------
# libbuilt: from llvm/builtin
C_SRCPATH:=$(abspath ./builtins)
C_SRCS=$(notdir $(wildcard $(C_SRCPATH)/*.c))
LIBOBJS=$(C_SRCS:%.c=$(DSTDIR)/%.o)
#$(info TEST >>>>)
#$(info $(C_SRCS))
#$(info $(LIBOBJS))
#$(info <<<< TEST)
all: $(DSTDIR)/../include $(DSTDIR)/libbuilt.a
echo "BUILTIN library is prepared"
# create library
$(DSTDIR)/libbuilt.a: $(LIBOBJS)
@$(AR) r $@ $(LIBOBJS)
# create the 'include' directory
$(DSTDIR)/../include:
mkdir -p $@
# compile objects
define GEN_C2O_RULE
$(DSTDIR)/$(cfile:.c=.o) : $(DSTDIR)/$(cfile:.c=.c_S)
@echo "Assemble $$@ from $$<"
@$(AS) $(ASARCH) -gdwarf-5 -o $$@ $$<
$(DSTDIR)/$(cfile:.c=.c_S) : $(C_SRCPATH)/$(cfile)
@echo "Compile '$$<' to '$$@' $(C_SRCPATH)"
@$(CC) $(CCARCH) $(CFLAGS) -fno-addrsig -I$(C_SRCPATH) -S -o $$@ $$<
endef
$(foreach cfile, $(C_SRCS), \
$(eval $(GEN_C2O_RULE)) \
)
# ------------------------------------------------------------------------------
clean:
rm -f `find $(DSTDIR) -name "*.o"`
rm -f `find $(DSTDIR) -name "*.a"`
# Makefile for building libc
#include $(TOP)/Makefile.tools
# library specific compilation flags
CFLAGS=-ffreestanding -ffunction-sections -fdata-sections -Wno-unknown-pragmas $(LIBFLAGS)
all: $(LIBFILE) $(DSTDIR)/../include
@echo "NEWLIB (C/M) library is prepared"
# ------------------------------------------------------------------------------
# libc: from newlib
# build with configure/make
NEWLIB_SRCPATH=$(abspath ./newlib/newlib)
#C_SRCPATH=$(abspath ./newlib/newlib/libc)
#M_SRCPATH=$(abspath ./newlib/newlib/libm)
SRCPATH=$(abspath ./newlib/newlib/$(LIB))
NEWLIB_DSTDIR=$(DSTDIR)/../../newlib/$(ARCHID)
$(NEWLIB_DSTDIR):
mkdir -p $(NEWLIB_DSTDIR)
$(DSTDIR):
mkdir -p $(DSTDIR)
# run building
$(DSTDIR)/libc.a: $(DSTDIR) $(NEWLIB_DSTDIR) $(NEWLIB_DSTDIR)/libc.a
cp $(NEWLIB_DSTDIR)/libc.a $@
$(DSTDIR)/libm.a: $(DSTDIR) $(NEWLIB_DSTDIR) $(NEWLIB_DSTDIR)/libm.a
cp $(NEWLIB_DSTDIR)/libm.a $@
$(NEWLIB_DSTDIR)/libc.a: $(NEWLIB_DSTDIR)/Makefile
(cd $(NEWLIB_DSTDIR); AR_FLAGS=r make;)
$(NEWLIB_DSTDIR)/libm.a: $(NEWLIB_DSTDIR)/Makefile
(cd $(NEWLIB_DSTDIR); AR_FLAGS=r make;)
# run configure
$(NEWLIB_DSTDIR)/Makefile: $(NEWLIB_SRCPATH)/configure
(cd $(NEWLIB_DSTDIR); CC=$(CC) CFLAGS="$(CCARCH) $(CFLAGS) -I$(SRCPATH)/include -I$(SRCPATH)/../libc/machine/riscv --sysroot=$(NEWLIB_DSTDIR)/sysroot64" RANLIB=$(RANLIB) AR=$(AR) $(NEWLIB_SRCPATH)/configure --build=$(TOOLCHAIN) --prefix=$(NEWLIB_DSTDIR)/sysroot64 --disable-newlib-fno-builtin --enable-silent-rules --enable-newlib-nano-malloc;)
# --enable-newlib-nano-formatted-io
# --disable-newlib-io-float --enable-newlib-nano-malloc --enable-newlib-nano-formatted-io
# -Werror
# run preparing copy of directory 'include' (libc)
ifeq ($(LIB),libc)
$(DSTDIR)/../include: $(SRCPATH)
cp -r $(SRCPATH)/include $(DSTDIR)/../include
cp $(SRCPATH)/machine/riscv/sys/*.h $(DSTDIR)/../include/sys
endif
ifeq ($(LIB),libm) # run copying include (libm)
$(DSTDIR)/../include:
mkdir -p $@
cp $(SRCPATH)/machine/riscv/*.h $@
endif
# ------------------------------------------------------------------------------
clean:
rm -f `find $(DSTDIR) -name "*.o"`
rm -f `find $(DSTDIR) -name "*.a"`
Makefile.libc
# Makefile for building soft-float library
#include $(TOP)/Makefile.tools
# library specific compilation flags
# always all in soft-float
CFLAGS=-msoft-float $(LIBFLAGS)
#-O2
# ------------------------------------------------------------------------------
# libsf: from soft-float
SF_SRCPATH:=$(abspath ./softfloat)
$(info >>> LIBRARY -SoftFloat)
$(info . LIB=$(LIB))
$(info . LIBFILE=$(LIBFILE))
$(info . TOP=$(TOP))
$(info . DSTDIR=$(DSTDIR))
$(info . CCARCH="$(CCARCH)")
$(info . ARCHID=$(ARCHID))
$(info . TOOL=$(TOOL))
$(info . TOOLCHAIN=$(TOOLCHAIN))
$(info . CC=$(CC))
$(info . LD=$(LD))
$(info . AS=$(AS))
$(info . AR=$(AR))
$(info . RANLIB=$(RANLIB))
SF_TARGET=DAITEQ-NOELV-LLVM
all: $(DSTDIR)/../include $(DSTDIR)/libsf.a
@echo "SoftFloat library is prepared"
$(DSTDIR)/../include:
@mkdir -p $@
$(DSTDIR)/libsf.a: $(SF_SRCPATH)/build/$(SF_TARGET)
@echo "softfloat in $<"
@make -C $< -f Makefile export TOP="$(TOP)" TOOL=$(TOOL) TOOLCHAIN=$(TOOLCHAIN) CC=$(CC) AR=$(AR) DSTPATH="$(DSTDIR)" INCPATH="$(DSTDIR)/../include" CFLAGS="$(CFLAGS)" CCARCH="$(CCARCH)"
# ------------------------------------------------------------------------------
clean:
rm -f `find $(DSTDIR) -name "*.o"`
rm -f `find $(DSTDIR) -name "*.a"`
Libraries for RISC-V
====================
The library 'newlib' has to be initialized with the included './init.sh' script.
The script clones the git repository and applies the patches.
Libraries contained in the package and used in the examples:
* newlib (libc, libm) - cloned from newlib-cygwin repository (commit 6238b1877d3c83b8b9ed589494d0705386b7f4f7) and patched to support half FP data type
* BSP (libbcc) - copied from Cobham Gaisler NOEL-V bare-metal C/C++ toolchain (ncc 1.0.4)
* builtins (libbuilt) - LLVM built-in functions
* softfloat (libsf) - copy of John R. Hauser SoftFloat library (currently not used in examples)
bsp/*/build
dist/
# if CC does not contain "clang" (including when it is unset or empty), use riscv-gaisler-elf-gcc
ifeq (,$(findstring clang,$(CC)))
override CC := riscv-gaisler-elf-gcc
MULTILIBS = multilibs_gcc
else
MULTILIBS = multilibs_llvm
endif
VER=2020q4
all:
for d in `dir bsp`; do \
if [ -f bsp/$$d/$(MULTILIBS) ] ; \
then \
for ml in `cat bsp/$$d/$(MULTILIBS)`; do \
ml_dir=`echo $$ml | cut '-d;' -f1`; \
ml_opt=`echo $$ml | cut '-d;' -f2 | sed 's/@/ -/g'`; \
echo PATH="$(PATH)"; \
echo MAKE="$(MAKE)"; \
echo BSPNAME="$${d}"; \
echo MULTI_DIR="$${ml_dir}"; \
echo MULTI_FLAGS="$${ml_opt}"; \
done; \
fi \
done;
one:
$(MAKE) PATH=$(PATH) -C bsp/$(VER) -f bsp.mk BSPNAME=$(VER) MULTI_DIR="$(MARCH)/$(MABI)" MULTI_FLAGS="-march=$(MARCH) -mabi=$(MABI)" all install
# if CC does not contain "clang" (including when it is unset or empty), use riscv-gaisler-elf-gcc
ifeq (,$(findstring clang,$(CC)))
override CC := riscv-gaisler-elf-gcc
MULTILIBS = multilibs_gcc
else
MULTILIBS = multilibs_llvm
endif
all:
for d in `dir bsp`; do \
if [ -f bsp/$$d/$(MULTILIBS) ] ; \
then \
for ml in `cat bsp/$$d/$(MULTILIBS)`; do \
ml_dir=`echo $$ml | cut '-d;' -f1`; \
ml_opt=`echo $$ml | cut '-d;' -f2 | sed 's/@/ -/g'`; \
$(MAKE) PATH=$(PATH) -C bsp/$${d} -f bsp.mk BSPNAME=$${d} MULTI_DIR="$${ml_dir}" MULTI_FLAGS="$${ml_opt}" all install || exit 1; \
done; \
fi \
done;
# Introduction
NCC is a bare-metal toolchain for NOEL-V. It can be used for developing
single-threaded C and C++ applications.
Use cases
* RTL simulation
* Build benchmarks for HW and sim. execution
* Try out new features and reproduce HW bugs
* Environment for single-threaded applications
* A set of tools useful when developing a custom run-time or boot loader.
## Getting started
Extract the binary distribution and add bin to PATH. For example
cd /opt
tar xf ncc-1.0.0-gcc.tar.bz2
PATH="$PATH:/opt/ncc-1.0.0-gcc/bin"
The following example is expected to "just work" on a 64-bit NOEL-V
system.
riscv-gaisler-elf-gcc hello.c -o hello.elf
It creates a program with the following characteristics:
* RV64IM instruction set
* linked to address 0
* UART, TIMER and PLIC are probed using Plug&Play.
* Plug&Play ioarea located at 0xFFF00000.
## Multilibs
The default multilib (used when neither -march nor -mabi is given) corresponds to
-march=rv64im -mabi=lp64.
$ riscv-gaisler-elf-gcc -print-multi-lib
.;
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32ic/ilp32;@march=rv32ic@mabi=ilp32
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32imc/ilp32;@march=rv32imc@mabi=ilp32
rv32ima/ilp32;@march=rv32ima@mabi=ilp32
rv32imac/ilp32;@march=rv32imac@mabi=ilp32
rv32imafd/ilp32d;@march=rv32imafd@mabi=ilp32d
rv32imafdc/ilp32d;@march=rv32imafdc@mabi=ilp32d
rv64ima/lp64;@march=rv64ima@mabi=lp64
rv64imac/lp64;@march=rv64imac@mabi=lp64
rv64imafd/lp64d;@march=rv64imafd@mabi=lp64d
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d
Note that the above list may change in the future.
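Each line of the list has the form `directory;@option@option`. The BSP Makefile in this package splits such a line with `cut` and turns every `@` into ` -` with `sed`. The following is a minimal sketch of the same transformation in C. The function name `multilib_split` is made up for this illustration and is not part of the toolchain.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split one multilib line ("dir;@opt1@opt2") into the multilib directory
 * and the compiler flags. Each '@' becomes " -", which mirrors
 * sed 's/@/ -/g'. Returns 0 on success, -1 if the line has no ';' or an
 * output buffer is too small.
 */
int multilib_split(const char *line, char *dir, size_t dirsz,
                   char *flags, size_t flagsz)
{
    const char *semi = strchr(line, ';');
    if (semi == NULL || flagsz == 0) {
        return -1;
    }

    /* Everything before the ';' is the directory. */
    size_t dlen = (size_t)(semi - line);
    if (dlen + 1 > dirsz) {
        return -1;
    }
    memcpy(dir, line, dlen);
    dir[dlen] = '\0';

    /* Everything after the ';' is the option list. */
    size_t n = 0;
    for (const char *p = semi + 1; *p != '\0'; p++) {
        if (*p == '@') {
            if (n + 2 >= flagsz) {
                return -1;
            }
            flags[n++] = ' ';
            flags[n++] = '-';
        } else {
            if (n + 1 >= flagsz) {
                return -1;
            }
            flags[n++] = *p;
        }
    }
    flags[n] = '\0';
    return 0;
}
```

For the line `rv64imafd/lp64d;@march=rv64imafd@mabi=lp64d` this yields the directory `rv64imafd/lp64d` and the flags ` -march=rv64imafd -mabi=lp64d` (with a leading space, as `sed` would produce).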
# libbcc
The library consists of:
- Run-time startup code.
- Functions for programming NOEL-V.
- Implementation of newlib C standard library portability layer.
The library is available in the target library file
libbcc.a. There are multiple versions of libbcc.a, customized
for specific BSPs and compiler options (GCC multilibs). The
exact version of the library is selected based on compiler
command line parameters. This also reflects that different
low-level drivers are implemented for different hardware.
## Usage
Functions described in the summary below have prototypes in
the header file bcc/bcc.h. The functions are implemented in
libbcc.a and are available per default when linking with the
GCC front-end. The same user API is available independent of
target NOEL-V hardware.
API summary follows. For API details, see bcc/bcc.h.
## Interrupt based timer service
Use bcc_timer_tick_init_period() to prevent the timer services
from wrapping around after 2^32 microseconds.
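To put the 2^32 microsecond limit in perspective, the sketch below computes how long a 32-bit microsecond counter runs before it wraps (a little over 71 minutes). It assumes, for illustration only, that the time value is held in a 32-bit unsigned integer. The helper names are hypothetical and not part of libbcc.

```c
#include <stdint.h>

/* Microseconds representable by a 32-bit counter before it wraps: 2^32. */
#define WRAP_USEC (UINT64_C(1) << 32)

/* Whole seconds until a 32-bit microsecond counter wraps around. */
uint64_t wrap_seconds(void)
{
    return WRAP_USEC / UINT64_C(1000000);
}

/*
 * Elapsed microseconds between two 32-bit timestamps. Unsigned
 * subtraction is modulo 2^32, so the result is still correct across
 * one wrap, but not across two or more.
 */
uint32_t elapsed_usec(uint32_t start, uint32_t now)
{
    return (uint32_t)(now - start);
}
```

`wrap_seconds()` evaluates to 4294, that is roughly 71.6 minutes. Measurements longer than that need a time base that does not wrap.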
## Exception API
Execution aborts with ebreak by default for non-interrupt
exceptions.
This behavior can be overridden in the user application
by implementing a function named __bcc_handle_exception().
The prototype is in <bcc/bcc_param.h>. See also the example
named "exception.c".
## Interrupt disable and enable
All maskable interrupts are disabled with bcc_int_disable()
and enabled again with bcc_int_enable(). A nesting mechanism
allows multiple disable operations to be performed in sequence
without the corresponding enable operation in between.
Interrupts are in the enabled state when main() is called.
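One common way to implement such nesting is for the disable operation to return the previous state and for the enable operation to restore it. The host-side model below sketches that idea with a simulated enable flag. The `sim_` names are hypothetical, and whether libbcc uses exactly this scheme should be checked in bcc/bcc.h.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hardware interrupt-enable bit. */
static bool sim_irq_enabled = true;

/* Disable interrupts and return the previous state (1 = was enabled). */
int sim_int_disable(void)
{
    int prev = sim_irq_enabled ? 1 : 0;
    sim_irq_enabled = false;
    return prev;
}

/* Restore the state returned by the matching sim_int_disable() call. */
void sim_int_enable(int prev)
{
    sim_irq_enabled = (prev != 0);
}

/* Query the simulated state. */
bool sim_int_is_enabled(void)
{
    return sim_irq_enabled;
}
```

With this scheme an inner disable/enable pair leaves interrupts disabled, and only the outermost enable turns them back on.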
## Interrupt source masking
An interrupt source can be masked (disabled) with bcc_int_mask()
and unmasked (enabled) with bcc_int_unmask().
## Interrupt service routines
Functions are provided for the user to install custom interrupt
service routines. It is possible to install multiple interrupt
handlers for the same interrupt: this is referred to as
interrupt sharing.
bcc_isr_register() and bcc_isr_unregister() can be used
to install an interrupt handler. These functions manage
memory allocation automatically by using malloc() and free()
internally.
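Interrupt sharing can be pictured as a linked list of handlers per interrupt source, with nodes allocated by malloc() and released by free(). The following host-side model is only a sketch of that idea. All `sim_` names, the handler signature and the number of sources are assumptions for illustration and do not reproduce the libbcc API.

```c
#include <stdlib.h>

#define SIM_NSOURCES 32 /* assumed number of interrupt sources */

typedef void (*sim_handler_t)(void *arg, int source);

struct sim_isr_node {
    struct sim_isr_node *next;
    sim_handler_t handler;
    void *arg;
};

/* One list head per interrupt source. */
static struct sim_isr_node *sim_isr_list[SIM_NSOURCES];

/* Register a handler; returns an opaque context, or NULL on failure. */
void *sim_isr_register(int source, sim_handler_t handler, void *arg)
{
    if (source < 0 || source >= SIM_NSOURCES || handler == NULL) {
        return NULL;
    }
    struct sim_isr_node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->handler = handler;
    n->arg = arg;
    n->next = sim_isr_list[source];
    sim_isr_list[source] = n;
    return n;
}

/* Unlink and free a registered handler; 0 on success, -1 if not found. */
int sim_isr_unregister(int source, void *ctx)
{
    if (source < 0 || source >= SIM_NSOURCES) {
        return -1;
    }
    struct sim_isr_node **pp = &sim_isr_list[source];
    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((void *)*pp == ctx) {
            struct sim_isr_node *n = *pp;
            *pp = n->next;
            free(n);
            return 0;
        }
    }
    return -1;
}

/* Call every handler sharing the source; returns how many were called. */
int sim_isr_dispatch(int source)
{
    int count = 0;
    if (source < 0 || source >= SIM_NSOURCES) {
        return 0;
    }
    for (struct sim_isr_node *n = sim_isr_list[source]; n; n = n->next) {
        n->handler(n->arg, source);
        count++;
    }
    return count;
}

/* Example handler: increments the int that arg points to. */
void sim_count_handler(void *arg, int source)
{
    (void) source;
    (*(int *) arg)++;
}
```

Registering `sim_count_handler` twice on the same source with two different counters makes one dispatch increment both counters, which is what sharing means here.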
## Processor API
libbcc is a single-threaded single-processor run-time. A limited
set of functionality is provided to start secondary processors.
The number of processors in the system can be retrieved with
the function bcc_get_cpu_count() and the ID of the current
processor is returned by bcc_get_cpuid().
A secondary processor (any CPU which is not CPU0) in a
multiprocessor NOEL-V system can be started by calling
bcc_start_processor().
The entry point and stack pointer are picked up from the
following data structure by the secondary processor:
struct bcc_startinfo {
void (*pc)(int mhartid);
uintptr_t sp;
};
extern struct bcc_startinfo __bcc_startinfo[16];
An example named cpustart.c is available which demonstrates
how this can be done.
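The sketch below shows how an entry of this table might be filled in before a secondary processor is started. It uses a local stand-in array (`sim_startinfo`) so that it builds on a host, and it assumes the table is indexed by CPU ID. All `sim_` names are hypothetical. On target, the real `__bcc_startinfo` and `bcc_start_processor()` would be used as shown in cpustart.c.

```c
#include <stdint.h>

/* Layout as given in the text above. */
struct bcc_startinfo {
    void (*pc)(int mhartid);
    uintptr_t sp;
};

/* Local stand-in for the run-time's table (assumption: indexed by CPU ID). */
struct bcc_startinfo sim_startinfo[16];

/* Private stack for the secondary processor. */
#define SIM_STACK_WORDS 256
static uintptr_t sim_stack[SIM_STACK_WORDS];

/* Records which hart entered, so the effect can be observed. */
volatile int sim_last_hart = -1;

/* Entry point for the secondary processor. */
void sim_secondary_entry(int mhartid)
{
    sim_last_hart = mhartid;
}

/* Fill in entry point and stack pointer for one secondary CPU. */
int sim_prepare_cpu(int cpuid)
{
    if (cpuid <= 0 || cpuid >= 16) {
        return -1; /* CPU0 is the boot processor; 16 entries exist */
    }
    sim_startinfo[cpuid].pc = sim_secondary_entry;
    /* The RISC-V stack grows downward and sp is kept 16-byte aligned. */
    uintptr_t top = (uintptr_t) &sim_stack[SIM_STACK_WORDS];
    sim_startinfo[cpuid].sp = top & ~(uintptr_t) 15;
    return 0;
}
```

Each secondary processor needs its own stack area. The single static stack here is enough only for starting one additional CPU.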
# Board support packages
A BSP is selected with the GCC option -qbsp=bspname, where
bspname specifies any of the BSPs described in this chapter. The
option is typically combined with -march=cpuname.
It is important that the -qbsp=, -march= and -mabi= options
are given to GCC both at the compile and link steps.
If option -qbsp= is not given explicitly, then -qbsp=2020q4
is implied.
## 2020q4
- 32-bit and 64-bit systems
- Uses AMBA Plug & Play to scan for devices.
- ioarea at 0xFFF00000
- RAM at 0x00000000
## Remapping ioarea
You can override the Plug&Play ioarea assumed by the run-time by defining the
global variable uintptr_t __bcc_ioarea. For example by adding the following in
global scope of a C file:
uintptr_t __bcc_ioarea = 0xffe00000;
# newlib
Low-level functionality required by newlib C standard library
is implemented in the NOEL-V specific layer (libbcc).
## File I/O
The target newlib supports file I/O on the standard
input, standard output and standard error files
(stdin/stdout/stderr). These files are always open and are
typically associated with an APBUART.
There is no support in libbcc for operating on disk files. There
is no file system support.
## Time functions
GPTIMER timer is used to generate the system time. The C
standard library functions time() and clock() return the time
elapsed in seconds and microseconds respectively. times()
and gettimeofday(), defined by POSIX, are also available.
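A portable way to turn two clock() samples into seconds is to divide their difference by CLOCKS_PER_SEC, which avoids hard-coding the tick unit. The helper below is a minimal sketch using only standard C. The name `clock_diff_seconds` is made up for this example.

```c
#include <time.h>

/* Seconds elapsed between two clock() samples. */
double clock_diff_seconds(clock_t start, clock_t end)
{
    return (double) (end - start) / (double) CLOCKS_PER_SEC;
}
```

Typical use is to call clock() before and after the code under measurement and pass both values to the helper.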
## Dynamic memory allocation
Dynamic memory can be allocated/deallocated using for example
malloc(), calloc() and free(). For information on customizing
the memory heap,
# FAQ
Q: Which NOEL-V specific functions are available?
For the NOEL-V API functions available to applications, see the file
libbcc/shared/include/bcc/bcc.h
Q: How do I install an exception handler?
Override the function __bcc_handle_exception(). There is a function prototype in
libbcc/shared/include/bcc/bcc_param.h.
See also the example in examples/exception/exception.c.
Q: How do I install an interrupt handler?
Use bcc_isr_register() and bcc_isr_unregister()
(libbcc/shared/include/bcc/bcc.h).
Q: How can I start the other processors?
See the example examples/cpustart/cpustart.c.
Q: Does the runtime support SMP?
The provided C standard library and libbcc support a single-threaded
single-processor run-time.
Functionality is available to start secondary processors. See the "cpustart"
example.
Q: Is the C standard library available?
Yes, newlib is available. malloc(), snprintf(), printf(), etc. can be used.
File I/O is limited to the files stdin, stdout and stderr.
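As a small illustration of the answer above, the sketch below combines malloc() and snprintf() from the C standard library to build a string on the heap. It uses only standard C. The function name `format_reading` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build the string "label=value" in a freshly allocated buffer.
 * Returns NULL on failure. The caller releases the result with free().
 */
char *format_reading(const char *label, int value)
{
    /* First pass: ask snprintf() how many characters are needed. */
    int need = snprintf(NULL, 0, "%s=%d", label, value);
    if (need < 0) {
        return NULL;
    }
    char *buf = malloc((size_t) need + 1);
    if (buf == NULL) {
        return NULL;
    }
    /* Second pass: format into the buffer, including the terminator. */
    snprintf(buf, (size_t) need + 1, "%s=%d", label, value);
    return buf;
}
```

The result can be written to stdout with printf() or puts(), since the standard streams are the only files available.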
#
# libBCC
################################################################################
#
# Required variables:
# -------------------
# CC - Compiler
# CFLAGS - Compiler flags
# AS - Assembler
# ASFLAGS - Assembler flags
# AR - Archiver
# ARFLAGS - Archiver flags
# OUTDIR - Build directory path
#
# Optional variables:
# ------------------
# BCC_XCFLAGS - Extra compiler flags
# BCC_XASFLAGS - Extra assembler flags
# BCC_XARFLAGS - Extra archiver flags
# BCC_LIBDIR - Path to where the archive will be located.
# If OUTDIR is set it defaults to $(OUTDIR)/bcc/lib.
# BCC_OBJDIR - Path to where the object files will be located
# If OUTDIR is set it defaults to $(OUTDIR)/bcc/obj.
# BCC_SOURCES - Override source selection
# EXTRA_SOURCES - Override extra source selection (not archived in LIBBCC)
#
# Exported variables:
# -------------------
# BCC_PATH - Path to BCC root directory
# BCC_CLEAN - Files to be cleaned
#
# BCC_INCLUDE - Include flags
# BCC_LIBS - Linker libraries flags
# BCC_OBJECTS - All BCC object files
# EXTRA_OBJECTS - All extra object files
# LIBBCC - BCC library target
#
###############################################################################
ifeq ($(BCC_PATH),)
BCC_PATH :=$(shell dirname $(lastword $(MAKEFILE_LIST)))
endif
BCC_CLEAN =$(BCC_OBJDIR)/* $(LIBBCC)
BCC_INCLUDE =-I$(BCC_PATH)/shared/include
BCC_LIBS =-L$(BCC_LIBDIR) -lbcc
BCC_OBJECTS =$(BCC_SOURCES:%=$(BCC_OBJDIR)/%.o)
EXTRA_OBJECTS =$(EXTRA_SOURCES:%=$(BCC_OBJDIR)/%.o)
ALL_OBJECTS = $(BCC_OBJECTS) $(EXTRA_OBJECTS)
LIBBCC =$(BCC_LIBDIR)/libbcc.a
BCC_SOURCES ?=
EXTRA_SOURCES ?=
_BCC_DEPS = $(ALL_OBJECTS:%.o=%.d)
ifeq ($(OUTDIR),)
$(error "OUTDIR not set")
#_BCC_OUTDIR=$(BCC_PATH)
else
_BCC_OUTDIR=$(OUTDIR)/$(BCC_PATH)
endif
ifeq ($(BCC_LIBDIR),)
BCC_LIBDIR=$(_BCC_OUTDIR)/lib
endif
ifeq ($(BCC_OBJDIR),)
BCC_OBJDIR=$(_BCC_OUTDIR)/obj
endif
$(filter %.c.d,$(_BCC_DEPS)): $(BCC_OBJDIR)/%.d: $(BCC_PATH)/%
@test -d $(@D) || mkdir -p $(@D)
@set -e; rm -f $@; \
$(CC) $(CFLAGS) $(BCC_XCFLAGS) $(BCC_INCLUDE) -MM -MT $(@:.d=.o) $< > $@.$$$$; \
sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \
rm -f $@.$$$$
$(filter %.S.d,$(_BCC_DEPS)): $(BCC_OBJDIR)/%.d: $(BCC_PATH)/%
@test -d $(@D) || mkdir -p $(@D)
@set -e; rm -f $@; \
$(CC) $(ASFLAGS) $(BCC_XASFLAGS) $(BCC_INCLUDE) -MM -MT $(@:.d=.o) $< > $@.$$$$; \
sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \
rm -f $@.$$$$
-include $(_BCC_DEPS)
# Workaround to generate errors for missing sourcefiles
$(addprefix $(BCC_PATH)/,$(BCC_SOURCES)): %:
$(error File $@ not found)
$(addprefix $(BCC_PATH)/,$(EXTRA_SOURCES)): %:
$(error File $@ not found)
$(filter %.c.o,$(ALL_OBJECTS)): $(BCC_OBJDIR)/%.o: $(BCC_PATH)/%
@test -d $(@D) || mkdir -p $(@D)
$(CC) $(CFLAGS) $(BCC_XCFLAGS) $(BCC_INCLUDE) $(_BCC_OSAL_DEFINE) -o $@ -c $<
$(filter %.S.o,$(ALL_OBJECTS)): $(BCC_OBJDIR)/%.o: $(BCC_PATH)/%
@test -d $(@D) || mkdir -p $(@D)
$(AS) $(ASFLAGS) $(BCC_XASFLAGS) $(BCC_INCLUDE) -o $@ -c $<
$(LIBBCC): $(BCC_OBJECTS)
@test -d $(@D) || mkdir -p $(@D)
$(AR) $(ARFLAGS) $(BCC_XARFLAGS) $@ $(filter %.o,$^)
32-bit and 64-bit systems
Uses AMBA Plug & Play to scan for devices etc.
ioarea at 0xFFF00000
RAM at 0x00000000
all:
include ../../defs.mk
include ../../common.mk
BCC_XCFLAGS =
BCC_XASFLAGS =
# Include list of common sources files for this BSP
BCC_SOURCES = $(BCC_COMMON_SOURCES)
# BSP specific sources
BCC_SOURCES += $(BCC_INT_IRQMP_SOURCES)
BCC_SOURCES += $(BCC_APBUART_SOURCES)
BCC_SOURCES += $(BCC_GPTIMER_SOURCES)
# Local BSP sources
EXTRA_SOURCES += shared/crt0.S
EXTRA_SOURCES += shared/first.S
EXTRA_DATA = $(COMMON_EXTRA_DATA)
include ../../bcc.mk
include ../../targets.mk
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef __BSP_H_
#define __BSP_H_
#define __BSP_CON_HANDLE 0
#define __BSP_TIMER_HANDLE 0
#define __BSP_TIMER_INTERRUPT 0
#define __BSP_PLIC_HANDLE 0
#define __BSP_CLINT_HANDLE 0
#define __BSP_IOAREA 0xFFF00000U
#define __BSP_CPU_COUNT 0
#endif
MEMORY
{
rom : ORIGIN = 0xC0000000, LENGTH = 512M
ram : ORIGIN = 0x00000000, LENGTH = 2048M
}
.;
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32ic/ilp32;@march=rv32ic@mabi=ilp32
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32imc/ilp32;@march=rv32imc@mabi=ilp32
rv32ima/ilp32;@march=rv32ima@mabi=ilp32
rv32imac/ilp32;@march=rv32imac@mabi=ilp32
rv32imafd/ilp32d;@march=rv32imafd@mabi=ilp32d
rv32imafdc/ilp32d;@march=rv32imafdc@mabi=ilp32d
rv64ima/lp64;@march=rv64ima@mabi=lp64
rv64imac/lp64;@march=rv64imac@mabi=lp64
rv64imafd/lp64d;@march=rv64imafd@mabi=lp64d
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d
.;
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32ic/ilp32;@march=rv32ic@mabi=ilp32
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32imc/ilp32;@march=rv32imc@mabi=ilp32
rv32ima/ilp32;@march=rv32ima@mabi=ilp32
rv32imac/ilp32;@march=rv32imac@mabi=ilp32
rv32imafd/ilp32d;@march=rv32imafd@mabi=ilp32d
rv32imafdc/ilp32d;@march=rv32imafdc@mabi=ilp32d
rv64ima/lp64;@march=rv64ima@mabi=lp64
rv64imac/lp64;@march=rv64imac@mabi=lp64
rv64imafd/lp64d;@march=rv64imafd@mabi=lp64d
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d
BCC_COMMON_SOURCES =
BCC_COMMON_SOURCES += shared/mp.c
BCC_COMMON_SOURCES += shared/isr.c
BCC_COMMON_SOURCES += shared/isr_register_node.c
BCC_COMMON_SOURCES += shared/isr_unregister_node.c
BCC_COMMON_SOURCES += shared/isr_register.c
BCC_COMMON_SOURCES += shared/isr_unregister.c
BCC_COMMON_SOURCES += shared/init.S
BCC_COMMON_SOURCES += shared/dwzero.c
BCC_COMMON_SOURCES += shared/copy_data.c
BCC_COMMON_SOURCES += shared/ambapp.c
BCC_COMMON_SOURCES += shared/ambapp_findfirst_fn.c
BCC_COMMON_SOURCES += shared/lowlevel.S
BCC_COMMON_SOURCES += shared/_exit.S
BCC_COMMON_SOURCES += shared/read.c
BCC_COMMON_SOURCES += shared/sbrk.c
BCC_COMMON_SOURCES += shared/heap.c
BCC_COMMON_SOURCES += shared/times.c
BCC_COMMON_SOURCES += shared/gettimeofday.c
BCC_COMMON_SOURCES += shared/write.c
BCC_COMMON_SOURCES += shared/stubs/stubs.S
BCC_COMMON_SOURCES += shared/stubs/environ.c
BCC_COMMON_SOURCES += shared/handle_exception.c
BCC_COMMON_SOURCES += shared/handle_interrupt.c
BCC_COMMON_SOURCES += shared/ioarea.c
BCC_COMMON_SOURCES += shared/argv.c
BCC_INT_IRQMP_SOURCES = shared/interrupt/int_irqmp_handle.c
BCC_INT_IRQMP_SOURCES += shared/interrupt/int_irqmp.c
BCC_INT_IRQMP_SOURCES += shared/interrupt/int_irqmp_get_source.c
BCC_INT_IRQMP_SOURCES += shared/interrupt/int_irqmp_init.c
BCC_INT_IRQMP_SOURCES += shared/interrupt/cpu_count.c
BCC_APBUART_SOURCES = shared/console/con_handle.c
BCC_APBUART_SOURCES += shared/console/con_apbuart.c
BCC_APBUART_SOURCES += shared/console/con_apbuart_init.c
BCC_GPTIMER_SOURCES = shared/timer/timer_handle.c
BCC_GPTIMER_SOURCES += shared/timer/timer_custom.c
BCC_GPTIMER_SOURCES += shared/timer/timer_gptimer.c
BCC_GPTIMER_SOURCES += shared/timer/timer_gptimer_tick.c
BCC_GPTIMER_SOURCES += shared/timer/timer_gptimer_init.c
COMMON_EXTRA_DATA = $(BSP_DIR)/linkcmds.memory
COMMON_EXTRA_DATA += $(BCC_PATH)/shared/linkcmds.base
COMMON_EXTRA_DATA += $(BCC_PATH)/shared/linkcmds
COMMON_EXTRA_DATA += $(BCC_PATH)/shared/linkcmds-rom
COMMON_EXTRA_DATA += $(BCC_PATH)/shared/linkcmds-any
ifeq ($(BSPNAME),)
$(error "BSPNAME not set")
endif
DEFS_DIR :=$(shell dirname $(lastword $(MAKEFILE_LIST)))
BSP_DIR := $(DEFS_DIR)/bsp/$(BSPNAME)
ifeq ($(BCC_PATH),)
BCC_PATH :=$(DEFS_DIR)
endif
PREFIX?=riscv-gaisler-elf-
# CC is overwritten when using clang/llvm
CC?=$(PREFIX)gcc
AS=$(CC)
AR=$(PREFIX)ar
ASFLAGS=-I$(BCC_PATH)/shared/inc $(MULTI_FLAGS)
ASFLAGS+=-g
CFLAGS=-std=c99 -g -O3 -Wall -Wextra -pedantic $(MULTI_FLAGS) -I$(BSP_DIR)/include
CFLAGS+=-fno-builtin
ifneq ($(ADDCFLAGS),)
CFLAGS += $(ADDCFLAGS)
endif
ifeq ($(DESTDIR),)
BCC_DISTDIR=$(DEFS_DIR)/dist
else
BCC_DISTDIR=$(DESTDIR)
endif
$(info BCC_DISTDIR=$(BCC_DISTDIR))
ifeq ($(BUILDDIR),)
OUTDIR=build/$(BSPNAME)/$(MULTI_DIR)
else
OUTDIR=$(BUILDDIR)/$(BSPNAME)/$(MULTI_DIR)
endif
$(info OUTDIR=$(OUTDIR))
BCC_OBJDIR=$(OUTDIR)/obj
BCC_LIBDIR=$(OUTDIR)/lib
INSTALL_DATA = install -m 644
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
/* Implementation of _exit. */
.include "macros.i"
.section ".text"
.global _exit
FUNC_BEGIN _exit
ebreak
FUNC_END _exit
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <stdlib.h>
#include "include/bcc/ambapp.h"
uint32_t ambapp_findfirst_fn(
void *info,
uint32_t vendor,
uint32_t device,
uint32_t type,
uint32_t depth,
void *arg
)
{
(void) vendor;
(void) device;
(void) depth;
struct amba_apb_info *apbi = info;
struct amba_ahb_info *ahbi = info;
if (NULL != arg) {
if (AMBAPP_VISIT_APBSLAVE == type) {
struct amba_apb_info *uinfo = arg;
*uinfo = *apbi;
} else if (
(AMBAPP_VISIT_AHBSLAVE == type) ||
(AMBAPP_VISIT_AHBMASTER == type)
) {
struct amba_ahb_info *uinfo = arg;
*uinfo = *ahbi;
}
}
return apbi->start;
}
#include <stddef.h>
#include "bcc/bcc_param.h"
/*
* 5.1.2.2.1 Program startup
* ...
* -- argv[argc] shall be a null pointer.
*/
int __bcc_argc = 0;
/* "Array of char pointers" */
static char *__bcc_argv[] = { NULL };
/* "Pointer to array of char pointers" */
char *((*__bcc_argvp)[]) = &__bcc_argv;
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include "bcc/bcc_param.h"
#include "bcc/bcc.h"
#include "bcc/regs/apbuart.h"
enum {
FIFO_UNKNOWN,
FIFO_YES,
FIFO_NO,
};
static int fifoinfo = FIFO_UNKNOWN;
int __attribute__((weak)) __bcc_con_outbyte(char x)
{
if (0 == __bcc_con_handle) { return 0; }
volatile struct apbuart_regs *regs = (void *) __bcc_con_handle;
int fi;
/* Use transmitter FIFO if available */
again:
fi = fifoinfo;
if (FIFO_YES == fi) {
/* Transmitter FIFO full flag is available */
while (regs->status & APBUART_STATUS_TF);
} else if (FIFO_NO == fi) {
/*
* Transmitter "hold register empty" AKA "FIFO empty" flag is
* available
*/
while (!(regs->status & APBUART_STATUS_HOLD_REGISTER_EMPTY));
} else {
/* First time: probe */
if (regs->ctrl & APBUART_CTRL_FA) {
fifoinfo = FIFO_YES;
} else {
fifoinfo = FIFO_NO;
}
goto again;
}
regs->data = x & 0xff;
return 0;
}
char __attribute__((weak)) __bcc_con_inbyte(void)
{
if (0 == __bcc_con_handle) { return 0; }
volatile struct apbuart_regs *regs = (void *) __bcc_con_handle;
while (0 == (regs->status & APBUART_STATUS_DR));
return regs->data & 0xff;
}
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <stdlib.h>
#include <bcc/bsp.h>
#include "bcc/bcc_param.h"
#include "bcc/ambapp.h"
int __bcc_con_init(void)
{
/*
* If the BSP has set a handle value at compile-time then just return.
* The rest of this function will be optimized away.
*/
if (__BSP_CON_HANDLE) {
return BCC_OK;
}
/* Skip scanning if handle was defined at link time. */
if (__bcc_con_handle) {
return BCC_OK;
}
__bcc_con_handle = ambapp_visit(
__bcc_ioarea,
VENDOR_GAISLER,
GAISLER_APBUART,
AMBAPP_VISIT_APBSLAVE,
4,
ambapp_findfirst_fn,
NULL
);
if (__bcc_con_handle) {
return BCC_OK;
} else {
return BCC_NOT_AVAILABLE;
}
}
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <bcc/bsp.h>
#include "bcc/bcc_param.h"
#ifndef __BSP_CON_HANDLE
#define __BSP_CON_HANDLE 0
#endif
uintptr_t __bcc_con_handle = __BSP_CON_HANDLE;
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
/* .data Load Memory Address (In ROM if ROM resident application) */
extern const uint32_t __data_start_lma;
/* Start/end of .data Run-time Memory Address */
extern uint32_t __data_start;
extern uint32_t __data_end;
/*
* IF .data LMA and VMA are equal THEN
* We are on a "RAM image" and nothing shall be copied.
* ELSE
* We are on a "ROM resident image" and .data has to be copied from
* persistent storage (ROM) to RAM.
* END IF
*/
void __bcc_copy_data(void)
{
const uint32_t *src = &__data_start_lma;
uint32_t *dst = &__data_start;
if (src == dst) {
return;
}
while (dst < &__data_end) {
*dst++ = *src++;
}
}
/*
* Copyright (c) 2017, Cobham Gaisler AB
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <macros.i>
.section ".text"
.global __bcc_crt0
/*
* At entry to __bcc_crt0, the following is assumed:
* - sp points to top of stack
*/
FUNC_BEGIN __bcc_crt0
call __bcc_init50
/* NOTE: Assuming sp is already set by boot loader. */
mv s0, sp
/*
* If the symbol __bcc_cfg_skip_clear_bss has non-zero value, then skip
* clearing .bss. Otherwise clear .bss.
*/
la t0, __bcc_cfg_skip_clear_bss
bnez t0, .Lskip_clear_bss
/* Clear bss */
la a0, __bss_start
la a2, __bss_end
/* Number of double words. */
sub a3, a2, a0
srl a1, a3, 3
call bcc_dwzero
.Lskip_clear_bss:
la t0, __bcc_sp_at_entry
SREG s0, 0(t0)
/*
* Copy .data from Load Memory Address (LMA) to Virtual Memory Address
* (VMA) if needed.
*/
call __bcc_copy_data
/* .data can be referenced only after return from __bcc_copy_data(). */
call __bcc_init60
#if 0
call __bcc_get_leon_info
#endif
call __bcc_con_init
call __bcc_timer_init
call __bcc_int_init
la a0, __libc_fini_array
call atexit
call __libc_init_array
call __bcc_init70
la t0, __bcc_argc
lw a0, 0(t0)
la t0, __bcc_argvp
LREG a1, 0(t0)
/* In case someone tries to reach environment. */
mv a2, zero
call main
call exit
ebreak
FUNC_END __bcc_crt0
.section .bss
.global __bcc_sp_at_entry
.align 8
__bcc_sp_at_entry:
.dword 0