CPU endianness
Like x86, the SW26010 is little-endian. The test program below prints 1 on both.
    #include <stdio.h>

    int is_little_endian() {
        short s = 0x0110;
        char *p = (char *)&s;
        return (p[0] == 0x10);
    }

    int main() {
        printf("%d\n", is_little_endian());
        return 0;
    }
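As a variation on the test above, here is a minimal sketch that checks byte order with a fixed-width type and `memcpy`, and also shows how to decode a value without depending on the host byte order at all. The names `is_little_endian_u32` and `load_le32` are hypothetical, chosen only for illustration, and this variant has not been run on an SW26010.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
int is_little_endian_u32(void) {
    uint32_t v = 0x01020304u;
    unsigned char b[sizeof v];
    memcpy(b, &v, sizeof v);   /* copy out the in-memory byte layout */
    return b[0] == 0x04;       /* lowest-order byte stored first */
}

/* Hypothetical helper: decodes 4 bytes stored in little-endian order.
 * It uses shifts only, so the result is the same on any host. */
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Calling `is_little_endian_u32()` from `main` and printing the result with `printf("%d\n", ...)` should match the output of the original test on a little-endian machine. Code that moves binary data between the x86 and SW nodes can use something like `load_le32` to stay independent of host byte order.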
Compilation
Printing the header search paths
    echo | sw5cc -host -E -Wp,-v -
    echo | sw5cc -slave -E -Wp,-v -
Key header paths
/usr/sw-mpp/swcc/lib/gcc-lib/sw_64-swcc-linux/5.421-sw-500/include
Contains the SIMD and DMA functions.
/usr/sw-mpp/swcc/sw5gcc-binary/include
Contains the LDM, FFT, and Athread functions.
Debugging
Job system
Setting the log level
System
x86 node OS: Red Hat Enterprise Linux Server release 6.6
SW node OS: RaiseOS
RaiseOS
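The general impression is that this system is much like a rootfs built with BusyBox, which makes a whole Sunway node essentially an oversized development board. Below is the process list of a Sunway node at one point in time. Incidentally, the system does not even have bash.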
    PID   USER     TIME   COMMAND
        1 root       0:16 init
        2 root       0:00 [kthreadd]
        3 root       0:00 [ksoftirqd/0]
        5 root       0:00 [kworker/0:0H]
        6 root       0:00 [kworker/u:0]
        7 root       0:00 [kworker/u:0H]
        8 root       0:00 [migration/0]
        9 root       0:00 [rcu_bh]
       10 root       0:26 [rcu_sched]
       11 root       0:00 [ksoftirqd/4]
       12 root       0:02 [migration/4]
       13 root       0:00 [kworker/4:0]
       14 root       0:00 [kworker/4:0H]
       15 root       0:00 [ksoftirqd/8]
       16 root       0:00 [migration/8]
       17 root       0:00 [kworker/8:0]
       18 root       0:00 [kworker/8:0H]
       19 root       0:00 [ksoftirqd/12]
       20 root       0:00 [migration/12]
       21 root       0:00 [kworker/12:0]
       22 root       0:00 [kworker/12:0H]
       23 root       0:00 [cpuset]
       24 root       0:00 [khelper]
       25 root       0:00 [netns]
       26 root       0:00 [bdi-default]
       27 root       0:00 [kblockd]
       28 root       0:00 [rpciod]
       29 root       0:06 [kworker/12:1]
       30 root       0:01 [kswapd0]
       31 root       0:00 [kswapd1]
       32 root       0:00 [kswapd2]
       33 root       0:00 [kswapd3]
       34 root       0:00 [nfsiod]
       35 root       0:00 [mlx4]
       36 root       0:06 [kworker/8:1]
       37 root       0:08 [kworker/4:1]
       38 root       0:00 [kworker/0:1]
       39 root       0:00 [ib_mcast]
       40 root       0:00 [ib_cm]
       41 root       0:00 [iw_cm_wq]
       42 root       0:00 [ib_addr]
       43 root       0:00 [rdma_cm]
       44 root       0:00 [mthca_catas]
       45 root       0:00 [mlx4_ib]
       46 root       0:00 [mlx4_ib_mcg]
       47 root       0:00 [ib_mad1]
       48 root       0:00 [deferwq]
       49 root       0:00 [kworker/u:1]
       50 root       0:00 {rcS} /bin/sh /etc/init.d/rcS
      120 root       0:00 /usr/sbin/telnetd
      127 root       0:02 /usr/sw-mpp/sbin/rmsd_100p_std
      130 root       0:00 /usr/sw-mpp/sbin/swres -c
      135 root       0:00 sh /sbin/start_online1.sh
      136 root       0:00 /sbin/ntpd -p *** -Nn
      148 root       1:13 /usr/local/sbin/lwfs -f /etc/lwfs/lwfs.vol -l /dev/shm/lw
      187 root       0:00 sh
      197 root       0:41 /sbin/sotailf_brief -t *** -p *** /dev/shm/lwfs_onl
      206 root       0:32 [kworker/0:2]
     3406 root       0:00 sleep 600
     3413 root       0:00 [flush-0:14]
     3420 root       0:00 /usr/sw-mpp/sbin/taskstarter -jobid 49200448 -rh mn005 -r
     3421 *          0:00 /bin/ps -ef
For reference, here are the contents of cpuinfo and meminfo.
    cpu                   : SW_64
    cpu model             : SW5
    cpu variation         : 1
    cpu revision          : 0
    cpu serial number     :
    system type           : shenwei
    system variation      : 0
    system revision       : 0
    system serial number  :
    cycle frequency [Hz]  : 1450000000
    timer frequency [Hz]  : 0.24
    page size [bytes]     : 8192
    phys. address bits    : 44
    max. addr. space #    : 255
    BogoMIPS              : 0.81
    kernel unaligned acc  : 0 (pc=0,va=0)
    user unaligned acc    : 88465541 (pc=4ff0423298,va=5000281dcc)
    platform string       : N/A
    cpus detected         : 4
    cpus active           : 4
    cpu active mask       : 0000000000001111
    cpus core_start       : 000000000000000f
    mem cycle freq        : 500
    L1 Icache             : 64K, 2-way, 64b line
    L1 Dcache             : 64K, 2-way, 64b line
    L2 cache              : n/a
    L3 cache              : n/a
    MemTotal:        2038472 kB
    MemFree:          582544 kB
    Buffers:               0 kB
    Cached:          1199688 kB
    SwapCached:            0 kB
    Active:           830256 kB
    Inactive:         429920 kB
    Active(anon):     133568 kB
    Inactive(anon):     1120 kB
    Active(file):     696688 kB
    Inactive(file):   428800 kB
    Unevictable:       70912 kB
    Mlocked:            2904 kB
    SwapTotal:             0 kB
    SwapFree:              0 kB
    Dirty:                 0 kB
    Writeback:             0 kB
    AnonPages:        131376 kB
    Mapped:             4224 kB
    Shmem:              3888 kB
    Slab:              22480 kB
    SReclaimable:       6392 kB
    SUnreclaim:        16088 kB
    KernelStack:        1808 kB
    PageTables:          352 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:     1019232 kB
    Committed_AS:     201832 kB
    VmallocTotal:    8388608 kB
    VmallocUsed:       12208 kB
    VmallocChunk:    8376400 kB
    AnonHugePages:         0 kB
    HugePages_Total:       0
    HugePages_Free:        0
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    Hugepagesize:       8192 kB
    ======================= cg0 =====================
    UserPages_Mem_size:     8192 MB
    UserPages_Conti_Total:  7680 MB
    UserPages_Conti_Free:   7680 MB
    UserPages_Conti_Used:      0 MB
    UserPages_Cross_Size:      0 MB
    ======================= cg1 =====================
    UserPages_Mem_size:     8192 MB
    UserPages_Conti_Total:  7680 MB
    UserPages_Conti_Free:   7680 MB
    UserPages_Conti_Used:      0 MB
    UserPages_Cross_Size:      0 MB
    ======================= cg2 =====================
    UserPages_Mem_size:     8192 MB
    UserPages_Conti_Total:  7680 MB
    UserPages_Conti_Free:   7680 MB
    UserPages_Conti_Used:      0 MB
    UserPages_Cross_Size:      0 MB
    ======================= cg3 =====================
    UserPages_Mem_size:     8192 MB
    UserPages_Conti_Total:  7680 MB
    UserPages_Conti_Free:   7680 MB
    UserPages_Conti_Used:      0 MB
    UserPages_Cross_Size:      0 MB
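Runtime
CPEs (slave cores)
A CPE can call ordinary C functions directly, which is a considerable advantage over a CUDA kernel. A simple test program shows that printf, rand, and memset can all be called normally.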
    /* Host (MPE) code */
    #include <stdio.h>
    #include <athread.h>

    extern void SLAVE_FUN(cpe_func)();

    int main() {
        printf("Hello world from MPE.\n");
        athread_init();
        athread_spawn(cpe_func, NULL);
        athread_join();
        return 0;
    }

    /* Slave (CPE) code */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include "slave.h"

    void cpe_func() {
        int thread_id;
        thread_id = athread_get_id(-1);
        srand(time(NULL));
        int arr[10], i;
        for (i = 0; i < 10; ++i) {
            arr[i] = rand();
        }
        /* Zero elements 5..7 to check that memset works on a CPE. */
        memset(&arr[5], 0, sizeof(int) * 3);
        /* Only CPE 0 prints, to keep the output readable. */
        if (thread_id == 0) {
            printf("Hello World from CPE. Generating Array:\n");
            for (i = 0; i < 10; ++i) {
                printf("%d ", arr[i]);
            }
            printf("\n");
        }
    }
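Odd issues
Function naming
Do not start a function name with slave_, or you will get an "undefined reference to slave_slave_***" error. During compilation, a CPE function is renamed to slave_ followed by its original name. Some of the compiler's built-in macros make this visible.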
    #define SLAVE_FUN(x) slave_##x
    #define athread_spawn(y,z) __real_athread_spawn(slave_##y,z)
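To see where the doubled prefix comes from, the token pasting can be reproduced with plain C on any machine. The sketch below copies only the `SLAVE_FUN` definition from the macros above. The `STR`/`XSTR` helpers and the two function names are hypothetical additions for illustration; they turn the pasted identifier into a string so it can be inspected.

```c
/* Copied from the compiler macro shown above. */
#define SLAVE_FUN(x) slave_##x

/* Hypothetical helpers: expand the argument first, then stringify it. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* A CPE function named cpe_func is referenced as slave_cpe_func. */
const char *symbol_for_plain_name(void) {
    return XSTR(SLAVE_FUN(cpe_func));
}

/* A CPE function already named slave_cpe_func gets a second prefix. */
const char *symbol_for_prefixed_name(void) {
    return XSTR(SLAVE_FUN(slave_cpe_func));
}
```

Printing the two return values shows the identifiers the preprocessor generates for each name. The second is a `slave_slave_` name of the kind that appears in the linker error.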