DPRINTF in commMonitor modified but not working

Hi,
I have modified a DPRINTF in the comm_monitor.cc file under /build/ARM/mem, changing it from

DPRINTF(CommMonitor, "Forwarded read request\n");

to

DPRINTF(CommMonitor, "Forwarded read request %s \n", pkt->getAddr());

just like a previous post here:


but the trace output file is still printed as it was before (nothing has been added). Below is a sample of the output file:

11000: system.monitor2: Forwarded read request
66250: system.monitor2: Latency: 55250
66250: system.monitor2: Received read response
108000: system.monitor2: Forwarded read request
.........................

Am I doing something wrong?
Regards,
Farshid


_______________________________________________
gem5-users mailing list
gem5-users <at> gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
n26001482 via gem5-users | 31 Oct 14:28 2014

How and where to start with GEM5

Hi, all.

I've run GEM5 successfully in both SE mode and FS mode with an OS.

I want to understand how the CPUs work as a whole and then modify them so that they behave the way I want.

Alternatively, I want to add some additional hardware that works with the original CPUs within GEM5.

But I have no idea how or where to start. Should I start by studying the GEM5 source code?

I would appreciate any hints or suggestions.

BEST. 
M.Y.

Ahmad Hassan via gem5-users | 31 Oct 12:33 2014

Re: MMAP a file in SE mode

Hi Steve,

I am running x86 SE mode. writeBlob() works fine for a very small test application. For a real benchmark with a 1GB working set, the simulation ends with this panic:

panic: Tried to read unmapped address 0x2800000002d773b0.
0x2aaaaaaab000ULL
 @ tick 771687885000
[invoke:build/X86/arch/x86/faults.cc, line 160]
Memory Usage: 11788528 KBytes
Program aborted at tick 771687885000

Any idea why addresses in the 0x2800... range cause problems with writeBlob()?

Thanks.


On 7 October 2014 15:20, Steve Reinhardt <stever <at> gmail.com> wrote:
We have a patch internally that implements more of mmap(), but unfortunately it's not quite ready to post.

If you just want to do a read mapping (you don't care if writes to the mmap'd region get written back to disk), and you don't mind just reading the whole mmap region in up front (which you need to do, since SE mode doesn't support page faulting), it's not too hard; just call p->allocateMem() to allocate the memory in the simulated process, and then read the data out of the file and use writeBlob() to copy it into the memory you just allocated.

Steve

On Tue, Oct 7, 2014 at 6:14 AM, Ahmad Hassan via gem5-users <gem5-users <at> gem5.org> wrote:
Hi,

The existing implementation in GEM5 SE mode only supports MMAP to /dev/zero. Has anyone implemented MMAP in gem5 that can map a file from the disk? If not, how can I extend this?

Regards,


Re: Fullsystem with NoC

Hi Babak,

Actually, I understand the usage of Mesh topology. But my goal is to group 4 processors using crossbar, or connecting them to a single router. That is, if I have 32 cores, making groups of 4, I need 8 routers.

This resembles a symmetric multiprocessor with 8 processing units connected by the NoC (hence 8 routers), each with 4 cores (connected by a shared L2 over simple buses, for example).

Thank you all!

Regards,
Matheus Alcântara Souza

(Sent from a mobile device)

On 31/10/2014, at 06:04, babak aghaei <babak_aghaeii <at> yahoo.com> wrote:

Hi,
Sorry for the delay; your email had gone to spam and I hadn't checked it. I don't clearly understand your issue, but if you want a 32-core platform with a crossbar, it is enough to pass -n 32 on the command line. If you want a Mesh topology, you must set the mesh rows and the number of directories, for example, for 32 cores: --mesh-rows=4 --num-dirs=32. The simulator divides n by the number of rows to get the number of columns.
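The row/column division described above can be sketched in plain shell. This is only an illustration of the arithmetic, assuming the router count must divide evenly by the row count; `mesh_cols` is a hypothetical helper name, not a gem5 command or option.

```shell
#!/bin/sh
# Hypothetical helper (not part of gem5): given a router count and a row
# count, print the number of mesh columns, or fail if they do not divide.
mesh_cols() {
    routers=$1
    rows=$2
    if [ "$rows" -le 0 ] || [ $((routers % rows)) -ne 0 ]; then
        echo "error: $routers routers do not divide into $rows rows" >&2
        return 1
    fi
    echo $((routers / rows))
}

# 32 routers arranged in 4 rows, as in the --mesh-rows=4 example
mesh_cols 32 4
# 8 routers (one per group of 4 cores) arranged in 2 rows
mesh_cols 8 2
```

Checking the division up front makes an uneven layout fail loudly instead of silently truncating the column count.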
Best
 
---------------------------------------------------------------
Babak Aghaei
Ph.D candidate

From: Matheus Alcântara Souza <ticksmas <at> gmail.com>
To: babak aghaei <babak_aghaeii <at> yahoo.com>
Cc: gem5 users mailing list <gem5-users <at> gem5.org>
Sent: Tuesday, October 14, 2014 3:22 AM
Subject: Re: [gem5-users] Fullsystem with NoC

Thank you, sir! After refreshing everything, Garnet is working.

I wonder now whether the Cluster topology might work. Suppose I have a 4-core "chip" with a crossbar interconnect, and 8 such "chips" connected through a Mesh NoC (2x4), for 32 cores in total.

Any tips?



Regards,
Matheus Alcântara Souza
(Via iPhone)

On 11/10/2014, at 17:20, babak aghaei <babak_aghaeii <at> yahoo.com> wrote:



Hi,
This is possible. First you must set up the Garnet network, and then you can run any benchmark on it.
Best
---------------------------------------------------------------
Babak Aghaei
Ph.D candidate

From: Matheus Alcântara Souza via gem5-users <gem5-users <at> gem5.org>
To: "gem5-users <at> gem5.org" <gem5-users <at> gem5.org>
Sent: Saturday, October 11, 2014 11:27 PM
Subject: [gem5-users] Fullsystem with NoC

Dear all,

I've been reading the gem5 list for quite some time, with the goal of learning how to run applications (such as the PARSEC ones) in full-system mode over a network-on-chip architecture.

I concluded that this is not currently possible. Am I wrong? If so, what should I do to run this?

If I'm right, what should be the first thing to check or change to make this possible? Maybe the message generator should be adapted, as well as the Ruby memory protocols.

Thank you all!

Regards,
Matheus Alcântara Souza
(Via iPhone)

System calls query in se mode

Hi,

I am trying to understand the system call "mremapFunc" in SE mode, but I cannot work out how the call is made. I understand that there is no direct call to this function; rather, it is invoked through doSyscall using the call number and call parameters. However, I am unable to find the exact place where "mremapFunc" is invoked, that is, where the system call name and arguments are set. I also searched for "setSyscallArg" but still could not figure out how this particular system call is made. Understanding it by debugging is also difficult because the call is made infrequently. I tried debugging with "gem5.debug" and the "SyscallVerbose" flag as well, without much luck. Can anyone please help me understand this, or tell me how I could trigger this system call after a few cycles?

Thanks,
Debiprasanna Sahoo


Assertion `!delayedResponse' failed, In X86 FS simulation

Hello everyone,

I'm running some map-reduce benchmarks in X86 FS dual mode. I can run the benchmark with the atomic simple CPU, but when I use the O3 CPU, I get the following error after simulating around 3 seconds. I should also mention that I've hardcoded PCI accesses to be uncacheable in order to be able to start detailed simulation (see "Re: [gem5-dev] Ethernet device doesn't work with O3 cpu model in X86 ISA").
Any help?

Thank you,
Mohammad

switching cpus
**** REAL SIMULATION ****
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: Tried to clear PCI interrupt 10
warn: instruction 'fild' unimplemented
warn: instruction 'fistp' unimplemented
warn: instruction 'fucomi' unimplemented
warn: instruction 'fsubrp' unimplemented
warn: instruction 'fistp' unimplemented
warn: instruction 'fistp' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'clflush' unimplemented
warn: instruction 'fcmovne' unimplemented
warn: instruction 'fucomip' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: instruction 'prefetch_nta' unimplemented
warn: x86 cpuid: unimplemented function 4
gem5.opt: build/X86/arch/x86/pagetable_walker.cc:630: bool X86ISA::Walker::WalkerState::recvPacket(PacketPtr): Assertion `!delayedResponse' failed.



Installing OpenCV on ARM Image and running it on gem5

Hi,

I was wondering if anyone has tried to install opencv on the arm image (armv7) provided in the repository?

I was able to install all the dependencies for opencv 2.4.5 but when I do a make (after chroot'ing into the image), I get an error
"virtual memory exhausted: cannot allocate more memory".

I tried cross-compiling; however, the cross compiler does not detect the ffmpeg library. I modified the following variables:
  • LD_LIBRARY_PATH
  • C_INCLUDE_PATH
  • CPLUS_INCLUDE_PATH
  • PKG_CONFIG_PATH
  • PKG_CONFIG_LIBDIR
  • PATH
  • CMAKE_LIBRARY_PATH
  • CMAKE_INCLUDE_PATH

and pointed cmake to the ffmpeg libraries compiled for the armv7 architecture. However, it does not detect them, and my cmake output is:

-- Video I/O:
-- DC1394 1.x: NO
-- DC1394 2.x: NO
-- FFMPEG: **NO**
-- codec: NO
-- format: NO
-- util: NO
-- swscale: NO
-- gentoo-style: YES
-- GStreamer: NO
-- OpenNI: NO
-- OpenNI PrimeSensor Modules: NO
-- PvAPI: NO
-- GigEVisionSDK: NO
-- UniCap: NO
-- UniCap ucil: NO
-- V4L/V4L2: NO/YES
-- XIMEA: NO
-- Xine: NO

My host machine is a 64-bit x86 machine.

Regards,
Urmish

Re: Dear Joel, Could you please help me with a problem on Parsec benchmarks?

Hi Hao Sun,
  
3. The problem is that the simulation showed "Killed" after 20+ hours, and I am not sure where the problem is. But when I run a single benchmark, there is no problem. So:
     i. Can I use export GOMP_CPU_AFFINITY="0-7 8-15" to set CPU affinity? I am not sure which benchmarks support OpenMP. Otherwise, could you tell me the right way to bind different benchmarks to different CPUs?

Unfortunately, the pre-compiled PARSEC benchmarks on the disk image use Pthreads for parallelization rather than OpenMP, and they were not compiled to set CPU thread affinity. The affinity environment variable you're exporting won't affect these benchmarks.

Given the complexity of what you're trying to simulate, it is likely you'll need to modify either the disk image or the benchmarks themselves. You may want to familiarize yourself with that process as described in our tech report: http://www.cs.utexas.edu/~parsec_m5/TR-09-32.pdf


     ii. I am trying to issue 16 threads (8 for blackscholes, 8 for bodytrack), so is my rcS file right?

Technically, yes, your rcS file will run the benchmarks concurrently and with the expected number of threads. However, running a multithreaded AND multiprocess workload like this is going to be very tricky for a couple of reasons:

  First, note that when you use the Linux terminal command '&', the last command is the one that gates the progress of the terminal thread. This means you can get race conditions and the simulation can exit by falling through to the '/sbin/m5 exit' before some of the applications have completed. Based on your rcS file and the behavior you're seeing, I suspect you may be running into this problem. I'd encourage you to play around with the following toy example in a standard bash terminal to see what I mean:

  % (sleep 2; echo "first") & (sleep 1; echo "second")
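The fall-through described above can be reproduced, and one common remedy shown, with a short script. The `wait` builtin is standard POSIX shell but is not mentioned in this reply, so treat this as a sketch of one possible fix rather than the recommended method; the `sleep` durations and `echo` strings stand in for real benchmarks, and `run_pair` is a hypothetical name.

```shell
#!/bin/sh
# run_pair: start a slow job in the background and a fast job in the
# foreground, then report completion. With wait_all=yes, block on the
# background job first so that nothing is reported (or exited) too early.
run_pair() {
    wait_all=$1
    (sleep 2; echo "first") &
    (sleep 1; echo "second")
    if [ "$wait_all" = "yes" ]; then
        wait    # returns only after every background child has finished
    fi
    echo "done"
}

# With the wait, "done" is printed only after both jobs have finished.
run_pair yes
```

Calling `run_pair no` instead shows the race: "done" appears before "first", which in an rcS file would correspond to reaching '/sbin/m5 exit' while a benchmark is still running.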


  Second, it seems you may be trying to not just get the benchmarks to run concurrently, but to also get their regions of interest (ROIs) to run concurrently. This is an even trickier problem to address than the process race problem. Depending on the way each benchmark works, it may take a varying amount of time for the control threads to set up the work and launch the worker threads, which means that one benchmark may actually complete its ROI before the other one even starts the ROI. If in fact you are trying to get the ROIs to run concurrently, you will probably need to do one or a couple things:
   A) You can extend the benchmark ROIs by increasing input set sizes or by modifying the benchmarks to loop over the ROI multiple times (the latter is often used in contention management papers when measuring workload throughput).
   B) Another option is to do some sophisticated, delayed benchmark launching. Here's an example rcS file snippet that would do that:

----------------------------------------------
#
# First, ensure that the first benchmark doesn't need to complete or the
# second benchmark runs longer than the first.
# Second, start the benchmark that takes longer to get to ROI
# Third, delay for roughly the difference in to-ROI run time
# Fourth, launch the second benchmark
# NOTE 1: Comments in an rcS can affect control process timing
# NOTE 2: The sleep command is pretty imprecise (by up to ms)
# NOTE 3: x264 runs longer than bandwidth_bench, but bandwidth_bench
#               takes 0.03s longer to get to its ROI
#
/sbin/m5 dumpresetstats
./bandwidth_bench &
sleep 0.03
parsec/install/bin/x264 <params>
/sbin/m5 dumpresetstats
echo "Done :D"
----------------------------------------------

   You'd need to run these benchmarks in isolation to collect the time it takes them to get to the ROI. It is also likely that you'd encounter some hairy non-determinism in the run times, especially if there may be contention for shared resources.
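Once the two to-ROI times have been measured in isolation, the delay to pass to `sleep` is just their difference. A minimal sketch of that arithmetic follows; `roi_delay` is a hypothetical helper, and the two times in the example call are placeholders, not measured data.

```shell
#!/bin/sh
# roi_delay: print the absolute difference between two to-ROI times
# (in seconds). The benchmark with the larger time should be launched
# first, and the other one after sleeping for the printed delay.
roi_delay() {
    awk -v a="$1" -v b="$2" \
        'BEGIN { d = a - b; if (d < 0) d = -d; printf "%.2f\n", d }'
}

# Placeholder measurements only: 0.15 s and 0.12 s to reach the ROI
roi_delay 0.15 0.12
```

Given the imprecision of `sleep` and the run-to-run variation noted above, the computed value is best treated as a starting point to be tuned.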

  Hope this helps,
  Joel



On Tue, Oct 28, 2014 at 10:59 AM, Hao Sun <haosun2014 <at> u.northwestern.edu> wrote:
Dear Joel Hestness,

Sorry to bother you, but I really need your help. I am trying to run 2 different PARSEC benchmarks on 2 groups of CPUs, e.g., with 16 CPUs in total, the first 8 CPUs run blackscholes and the other 8 CPUs run the bodytrack benchmark. I am running in full system mode, using the pre-compiled image file from http://www.cs.utexas.edu/~parsec_m5/

1. My gem5 command is:
./build/ALPHA_FS/gem5.opt  ./configs/example/fs.py -n 16 --script=./configs/boot/runScript/blackscholes_bodytrack_8_8.rcS

2. The corresponding rcS file is:

#!/bin/sh

# File to run the blackscholes benchmark and bodytrack

export GOMP_CPU_AFFINITY="0-7 8-15"
cd /parsec/install/bin
/sbin/m5 dumpstats
/sbin/m5 resetstats
./blackscholes 8 /parsec/install/inputs/blackscholes/in_4K.txt /parsec/install/inputs/blackscholes/prices.txt & ./bodytrack /parsec/install/inputs/bodytrack/sequenceB_1 4 1 1000 5 0 8
echo "Done :D"
/sbin/m5 exit
/sbin/m5 exit

3. The problem is that the simulation showed "Killed" after 20+ hours, and I am not sure where the problem is. But when I run a single benchmark, there is no problem. So:
     i. Can I use export GOMP_CPU_AFFINITY="0-7 8-15" to set CPU affinity? I am not sure which benchmarks support OpenMP. Otherwise, could you tell me the right way to bind different benchmarks to different CPUs?
     ii. I am trying to issue 16 threads (8 for blackscholes, 8 for bodytrack), so is my rcS file right?

Thanks for taking the time to read my email! I really need your help; I have been stuck on this problem for 3 weeks. Thanks in advance!

Best regards,
Hao Sun
Northwestern University



--
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/


Important help needed

Does anyone know how to manually install an operating system (in my case, Tizen) on the gem5 simulator?

--
Best Regards,
Anmol Mohanty


Phys reg 13 to 42 with no access!

Hi guys,

I'm looking at integer register values/accesses in the Physical Register
File. Running a few applications (from SPEC2006), it shows no accesses for
reg#13 to reg#42 (in some cases the range is from reg#15 to reg#33). I
was wondering whether anyone knows if these registers are reserved for
specific conditions or interrupts that would only be used in FS mode.
I'm using the ARM platform, running in SE mode.

Thanks,
Negar

-- 
Negar Miralaei
http://www.cl.cam.ac.uk/~nm537/

Re: Seg Fault with Multi-channel Memory

---------- Forwarded message ----------
From: "Erfan Azarkhish" <erfan.azarkhish <at> unibo.it>
Date: Oct 29, 2014 10:15 AM
Subject: Re: [gem5-users] Seg Fault with Multi-channel Memory
To: "Patrick L." <plafratt <at> gmail.com>, "gem5 users" <gem5-users <at> gem5.org>
Cc:

Hi Patrick,
Are you using gem5-stable? I ask because support for multi-channel memory is still in gem5-mirror.
I have seen this issue and I don't know the solution, but in my case I was able to simply comment out the code that was causing the exception (it was just a check statement). Now it is working fine, but the underlying issue still remains.

On Oct 28, 2014 11:21 PM, "Patrick L. via gem5-users" <gem5-users <at> gem5.org> wrote:
I am trying to simulate multi-channel memory, and I'm receiving a seg
fault:

0x0000000000445476 in System::cacheLineSize (this=0x0) at
build/X86/sim/system.hh:185

It appears that the System object referenced by the DRAMCtrl object is not
getting constructed properly.

I came across the previous post below on multi-channel memory, which
suggests that such configurations should work:

http://comments.gmane.org/gmane.comp.emulators.m5.users/16608

Has anyone seen this seg fault before? If so, do you know if I am doing
something wrong?

Any help is appreciated.

Thanks,
Patrick

